• Is it an option to extract data from SAP MM using ETL tools such as Talend, SSIS or PowerCenter in order to load data into an Oracle DW?

    Instead of using BW or HANA, is it a good alternative to design and populate an Oracle Data Warehouse, using ETL tools such as Talend, SSIS or PowerCenter to extract data from SAP modules? Any suggestions for an outstanding open source query and BI tool to access this Oracle DW?

    MarceloMSP, 5 points
  • NoSQL Database(s)

    I am new to this tech. I understand NoSQL is not relational. Could old COBOL file structures or similar systems be considered a NoSQL type, since we could put different record structures with variable lengths (dependent on array values) and we could design a UI as a record (incorporating arrays as...

    Chegutu, 465 points
  • MongoDB Java Connection driver

    I'm pretty new to MongoDB and I'm trying to figure out how to use the below code for its Java Connection driver. Why does it use a random number generator? if (!((_ok) ? true : (Math.random() > 0.1))) { return res; } Thanks!

    ITKE, 440,550 points
  • Error when configuring Hadoop on CentOS

    I've been configuring Hadoop on one of our servers that's running CentOS. When I run start-dfs.sh, I keep getting this error: WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable So I'm not sure what to do here. Has anyone...

    ITKE, 440,550 points
  • ElasticSearch for real-time statistics

    We have to log millions and millions of small log documents on a weekly basis. This includes: ad hoc queries for data mining, joining and filtering values, and full-text search with Python. We thought about using HBase and running Hadoop jobs to generate stat results. But the problem is that the results...

    ITKE, 440,550 points
  • What database program should we use for our big data?

    We have a bunch of text files, ranging between 300 and 400 GB. Here's the format: key1 value_a key1 value_b key1 value_c key2 value_d key3 value_e .... So each line is composed of a key and a value. I'm trying to create a database that would let us query all values of a key. But we're having issues...

    ITKE, 440,550 points
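
One way to approach the key/value question above without a heavyweight database: since the workload is "give me all values for a key", a single indexed table is enough. Here's a minimal sketch using Python's built-in sqlite3; the input format (one whitespace-separated key/value pair per line) is assumed from the excerpt:

```python
import sqlite3

def build_index(lines, db_path=":memory:"):
    """Load key/value pairs into a SQLite table, then index the key column."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT, v TEXT)")
    with con:  # one transaction for the whole batch
        con.executemany(
            "INSERT INTO kv VALUES (?, ?)",
            (line.split(None, 1) for line in lines if line.strip()),
        )
    con.execute("CREATE INDEX IF NOT EXISTS kv_k ON kv (k)")
    return con

def values_for(con, key):
    """Return every value stored under `key`."""
    return [v for (v,) in con.execute("SELECT v FROM kv WHERE k = ?", (key,))]
```

For hundreds of GB, point db_path at a file on disk and stream the text files in; SQLite reads from disk rather than holding everything in RAM.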
  • Small business KPIs in a startup

    What are the classic KPIs for a small business in each phase of the business life cycle, specifically in the start-up phase?

    Cmichaud, 5 points
  • How to implement data analytics in the mining industry?

    What are the value chain gaps where data analytics can be implemented in the mining industry? How can it be used to its full potential?

    Samayn, 5 points
  • Working with big data in Python and low RAM

    We have to implement algorithms for 1000-dimensional data with 200k+ data points in Python. I need to perform different operations (clustering, pairwise distance, etc.). When I try to scale all of the algorithms, I run out of RAM. But here's the thing: I need to do this with several computers with...

    ITKE, 440,550 points
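
For the low-RAM question above, the usual trick is to stream the data in fixed-size chunks so only one chunk is resident at a time. A stdlib-only sketch of that pattern (the chunk size and tuple-of-floats point format are placeholders; a real workload would more likely use numpy memory maps or scikit-learn's mini-batch estimators):

```python
import math
from itertools import islice

def chunked(iterable, size):
    """Yield lists of at most `size` items, so only one chunk is in RAM."""
    it = iter(iterable)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block

def distances_to(query, points, chunk_size=10_000):
    """Stream Euclidean distances from `query` to each point, chunk by chunk.
    `points` can be a lazy generator reading from disk."""
    for block in chunked(points, chunk_size):
        for p in block:
            yield math.dist(query, p)
```

The same chunked() helper can drive clustering assignments or any other per-point pass without loading the full data set.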
  • How to sort 500 GB text file in Linux

    We have a 500 GB text file that has roughly 10 billion rows that need to be sorted in alphabetical order. What's the best way to do this? An algorithm? For now, we've been using this command: LANG=C sort -k2,2 --field-separator=',' --buffer-size=(80% RAM) --temporary-directory=/volatile BigFile And...

    ITKE, 440,550 points
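
The GNU sort command in the question above is already the standard answer; for what it does internally, the classic algorithm is an external merge sort: sort RAM-sized chunks to temp files, then k-way merge the sorted runs. A hedged Python sketch (the chunk size is illustrative):

```python
import heapq
import os
import tempfile
from itertools import islice

def external_sort(in_path, out_path, lines_per_chunk=100_000):
    """Sort a file larger than RAM: sort fixed-size chunks to temp files,
    then k-way merge them with heapq.merge, which streams the runs."""
    chunk_paths = []
    with open(in_path) as src:
        while True:
            chunk = list(islice(src, lines_per_chunk))
            if not chunk:
                break
            chunk.sort()  # only one chunk is in memory at a time
            fd, path = tempfile.mkstemp(text=True)
            with os.fdopen(fd, "w") as tmp:
                tmp.writelines(chunk)
            chunk_paths.append(path)
    runs = [open(p) for p in chunk_paths]
    with open(out_path, "w") as dst:
        dst.writelines(heapq.merge(*runs))
    for f in runs:
        f.close()
    for p in chunk_paths:
        os.remove(p)
```

This is essentially what sort --buffer-size/--temporary-directory already does in optimized C, so tuning those flags (and keeping LANG=C for byte-order comparison) is usually better than reimplementing it.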
  • Database that’s not limited with RAM size

    For a project we're about to start, we're looking for this type of database: non-persistent; keys need to be updated once every 3-6 hours; quick selection of data by key; a DBMS; not in-memory; Java support. We were looking at MongoDB but it has high fragmentation costs. Redis looks good but our data...

    ITKE, 440,550 points
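
As a sketch of the access pattern described above (rebuild every few hours, disk-backed lookups by key), Python's stdlib dbm can serve as a stand-in. It is not a cross-language answer: the Java-support requirement would need a store both sides can read, e.g. an embedded engine like RocksDB, so treat this purely as an illustration of the rebuild-then-read cycle:

```python
import dbm

def rebuild(path, items):
    """Rebuild the disk-backed store from scratch on each refresh cycle
    (flag 'n' truncates), so nothing needs to persist between refreshes."""
    with dbm.open(path, "n") as db:
        for k, v in items:
            db[k.encode()] = v.encode()

def lookup(path, key):
    """Fast on-disk lookup by key; data is never fully loaded into RAM."""
    with dbm.open(path, "r") as db:
        return db[key.encode()].decode()
```

In a long-running service you would keep the read handle open between lookups instead of reopening per call.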
  • Hadoop error when accessing application through HTML

    We have installed Hadoop on our cluster and now we're installing HTTPFS to access the HDFS content over HTTP. We're able to access the normal page, but when we try to access HDFS, we get an error: {"RemoteException":{"message":"User: ubantu is not allowed to impersonate ubantu",...

    ITKE, 440,550 points
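
That "not allowed to impersonate" error above usually means the HTTPFS service user has no proxyuser grant. Assuming the service runs as the ubantu user shown in the message, something like the following in the NameNode's core-site.xml (followed by a restart) is the usual fix; the wildcard values are permissive placeholders that you would normally narrow down:

```xml
<property>
  <name>hadoop.proxyuser.ubantu.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.ubantu.groups</name>
  <value>*</value>
</property>
```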
  • What’s faster for big data processing: MongoDB or Redis?

    I'm currently working on my big data project and I'm trying to decide between Redis or MongoDB. Which one would be faster for processing (from a performance standpoint)? I would appreciate any advice available.

    ITKE, 440,550 points
  • Output results of Hive query into CSV file

    We're trying to put the results of a Hive query into a CSV file. Here's the command we came up with: insert overwrite directory '/home/output.csv' select books from table; So, when it's done, it says completed but we can't find the file. Where is it? Or should we extract it in a different way?

    ITKE, 440,550 points
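
On the Hive question above: INSERT OVERWRITE DIRECTORY writes to that path on HDFS (not the local filesystem), as Ctrl-A-delimited part files with names like 000000_0 rather than a .csv. After pulling the files down (e.g. with hadoop fs -get), a small converter turns them into real CSV; a sketch assuming Hive's default \x01 field delimiter:

```python
import csv

def hive_output_to_csv(in_path, out_path):
    """Rewrite one Ctrl-A (\\x01) delimited Hive output file as proper CSV."""
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for line in src:
            writer.writerow(line.rstrip("\n").split("\x01"))
```

Using the csv module (instead of a plain replace) also quotes any field values that themselves contain commas.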
  • How to parse big data JSON file

    I have a JSON file that is roughly 36 GB and I need to access it more efficiently. I've been using RapidJSON's SAX-style API in C++ but it takes about two hours to parse. Now here's my question: Should I split the big file into millions of small files? Is there any other approach I should take?...

    ITKE, 440,550 points
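
For the 36 GB JSON file above, splitting into millions of small files is rarely necessary: if the data can be rewritten once as newline-delimited JSON (one record per line), every later pass can stream it with memory bounded by the largest record, not the file size. A minimal sketch of the reading side (the one-time conversion would still use a SAX/streaming parser like the RapidJSON API already in use):

```python
import json

def stream_records(path):
    """Yield one record at a time from a newline-delimited JSON file."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

Because it's a generator, callers can filter or aggregate without ever materializing the whole data set.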
  • Speed up data processing in MongoDB

    I've been using MongoDB to get every document in a collection. It's working, but with so many small documents (there are over 100 million), it's very slow. Here's what I'm using: count <- mongo.count(mongo, ns, query) cursor <- mongo.find(mongo, query) name <- vector("character", count)...

    ITKE, 440,550 points
  • Out of memory error when installing Hadoop

    I recently tried to install Hadoop following a document my friend gave me. When I tried to execute this: bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+' I got this exception: java.lang.OutOfMemoryError: Java heap space Has anyone seen this before? Like I said, I'm pretty new to...

    ITKE, 440,550 points
  • JAVA_HOME is not set correctly when installing Hadoop on Ubuntu

    I've been trying to install Hadoop on Ubuntu 11.10. I just set the JAVA_HOME variable in the file conf/hadoop-env.sh to: # export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk Then I tried to execute these commands: $ mkdir input $ cp conf/*.xml input $ bin/hadoop jar hadoop-examples-*.jar grep input...

    ITKE, 440,550 points
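
One detail worth checking in the question above: the quoted hadoop-env.sh line still begins with #, so the export is commented out and JAVA_HOME never takes effect. Removing the leading # should be enough (the JDK path is the one from the question, so verify it exists on your machine):

```sh
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk
```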
  • Case-insensitive query in MongoDB

    Does anyone know if it's possible to make a case-insensitive query in MongoDB? Something like this: > db.stuff.save({"foo":"bar"}); > db.stuff.find({"foo":"bar"}).count(); 1 > db.stuff.find({"foo":"BAR"}).count(); 0 Thanks!

    ITKE, 440,550 points
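
For the case-insensitive MongoDB question above, the common workaround is an anchored regex with the i option (note it cannot use a normal index efficiently; storing a lowercased copy of the field is the usual faster long-term fix). A sketch of the filter document, built in Python for illustration:

```python
import re

def ci_filter(field, value):
    """Build a MongoDB filter matching `value` exactly but case-insensitively:
    the regex is anchored (^...$) and the value is escaped so regex
    metacharacters in it are treated literally."""
    return {field: {"$regex": f"^{re.escape(value)}$", "$options": "i"}}
```

With pymongo this would be passed straight to find(), e.g. db.stuff.find(ci_filter("foo", "bar")), which would then match both "bar" and "BAR".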
  • Hadoop: What’s the difference between Pig and Hive?

    I'm pretty new to the Hadoop world (been using it for about a month) and I've started to get into Hive, Pig and Hadoop using Cloudera's Hadoop VM. Is there a difference between Pig and Hive? I understand they have similar commands so I'm trying to figure out the big differences.

    ITKE, 440,550 points
