• How should I start my data analysis?

    I have a database containing machine data (tables such as machine cycles, machine phases, and many more), and I am supposed to analyze this data. That is, using the data in the different tables, I have to filter out, or rather define, standard machine cycles, standard machine phases, etc. Hence, can you...

    rakeshmurthy (5 points)
  • How to open the .FCA file format data?

    Dear Sir/Madam, I have a file in .fca format. I want to read, edit, print, etc. the .fca data. What should I do?

    herambgaikwad0125 points
  • Error when configuring Hadoop on CentOS

    I've been configuring Hadoop on one of our servers that's running CentOS. When I run start-dfs.sh, I keep getting this error: "WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable". So I'm not sure what to do here. Has anyone...

    ITKE (376,350 points)
  • ElasticSearch for real-time statistics

    We have to log millions and millions of small log documents on a weekly basis. This includes: ad hoc queries for data mining; joining and filtering values; full-text search with Python. We thought about using HBase and running Hadoop jobs to generate stat results. But the problem is that the results...

    ITKE (376,350 points)
  • What database program should we use for our big data?

    We have a bunch of text files, which range between 300 and 400 GB. Here's the format: key1 value_a key1 value_b key1 value_c key2 value_d key3 value_e .... So each line is composed of a key and a value. I'm trying to create a database that would let us query all values of a key. But we're having issues...

    ITKE (376,350 points)
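One way to picture the "all values of a key" lookup described above is an indexed on-disk table. The following is a minimal sketch using Python's built-in sqlite3 module, not a recommendation for data of this size. The table name `kv`, the helpers `load_pairs` and `values_for`, and the whitespace separator between key and value are all assumptions made for illustration, and the sketch assumes every non-blank line holds both a key and a value.

```python
import sqlite3

def load_pairs(conn, lines):
    """Insert 'key value' lines into a two-column table, then index the key."""
    conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT, value TEXT)")
    # Assumption: key and value are separated by whitespace; split only once
    # so that a value may itself contain spaces.
    pairs = (line.split(None, 1) for line in lines if line.strip())
    conn.executemany(
        "INSERT INTO kv VALUES (?, ?)",
        ((key, value.strip()) for key, value in pairs),
    )
    # The index lets a lookup by key avoid scanning the whole table.
    conn.execute("CREATE INDEX IF NOT EXISTS kv_key ON kv (key)")
    conn.commit()

def values_for(conn, key):
    """Return every value stored under `key`, in insertion order."""
    rows = conn.execute(
        "SELECT value FROM kv WHERE key = ? ORDER BY rowid", (key,)
    )
    return [row[0] for row in rows]

# ":memory:" keeps the demo self-contained; a file path would store it on disk.
conn = sqlite3.connect(":memory:")
load_pairs(conn, [
    "key1 value_a", "key1 value_b", "key1 value_c",
    "key2 value_d", "key3 value_e",
])
key1_values = values_for(conn, "key1")
```

Creating the index after the bulk insert, rather than before it, is a deliberate choice in this sketch: it avoids updating the index on every inserted row.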
  • Small business KPIs in startup

    What are the classic KPIs for a small business in each phase of the business life cycle, specifically in the start-up phase?

    Cmichaud (5 points)
  • How to implement data analytics in the mining industry?

    What are the value chain gaps where data analytics can be implemented in the mining industry? How can it be used to its full potential?

    Samayn (5 points)
  • Working with big data in Python and low RAM

    We have to implement algorithms for 1000-dimensional data with 200k+ data points in Python. I need to perform different operations (clustering, pairwise distance, etc.). When I try to scale all of the algorithms, I run out of RAM. But here's the thing: I need to do this with several computers with...

    ITKE (376,350 points)
  • How to sort 500 GB text file in Linux

    We have a 500 GB text file that has roughly 10 billion rows that need to be sorted in alphabetical order. What's the best way to do this? An algorithm? For now, we've been using this command: LANG=C sort -k2,2 --field-separator=',' --buffer-size=(80% RAM) --temporary-directory=/volatile BigFile And...

    ITKE (376,350 points)
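For readers wondering what kind of algorithm applies when a file does not fit in RAM, the usual answer is an external merge sort, which is also the general strategy GNU sort uses with its temporary directory. Below is a minimal Python sketch of the idea, keyed on the second comma-separated field to mirror `-k2,2 --field-separator=','`. The helper names `second_field` and `external_sort` and the `chunk_size` parameter are hypothetical, and the sketch is for illustration rather than a faster replacement for the command above.

```python
import heapq
import os
import tempfile

def second_field(line):
    """Sort key: the second comma-separated field (empty if it is missing)."""
    parts = line.rstrip("\n").split(",")
    return parts[1] if len(parts) > 1 else ""

def external_sort(lines, chunk_size, key=second_field, tmp_dir=None):
    """Yield lines in sorted order while holding at most chunk_size in memory."""
    paths = []
    chunk = []

    def flush():
        # Sort one in-memory chunk and spill it to its own temporary file.
        chunk.sort(key=key)
        fd, path = tempfile.mkstemp(dir=tmp_dir, text=True)
        with os.fdopen(fd, "w") as handle:
            handle.writelines(chunk)
        paths.append(path)
        chunk.clear()

    for line in lines:
        chunk.append(line if line.endswith("\n") else line + "\n")
        if len(chunk) >= chunk_size:
            flush()
    if chunk:
        flush()

    # Merge the sorted runs lazily; only one line per run is held at a time.
    files = [open(path) for path in paths]
    try:
        yield from heapq.merge(*files, key=key)
    finally:
        for handle in files:
            handle.close()
        for path in paths:
            os.remove(path)

rows = ["3,pear\n", "1,apple\n", "2,fig\n", "4,banana\n", "5,cherry\n"]
result = list(external_sort(rows, chunk_size=2))
```

In practice `lines` would be an open file object rather than a list, and `chunk_size` would be chosen from the available memory.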
  • Database that’s not limited by RAM size

    For a project we're about to start, we're looking for this type of database: non-persistent; keys of database need to be updated once in 3-6 hours; quickly select data by key; DBMS; not in-memory; Java support. We were looking at MongoDB but it has high fragmentation costs. Redis looks good but our data...

    ITKE (376,350 points)
  • Hadoop error when accessing application through HTML

    We have installed Hadoop on our cluster and now we're installing HTTPFS to access the HDFS content using the HTTP protocol. We're able to access the normal page, but when we tried to access HDFS, we got an error: {"RemoteException":{"message":"User: ubantu is not allowed to impersonate ubantu",...

    ITKE (376,350 points)
  • What’s faster for big data processing: MongoDB or Redis?

    I'm currently working on my big data project and I'm trying to decide between Redis and MongoDB. Which one would be faster for processing (from a performance standpoint)? I would appreciate any advice available.

    ITKE (376,350 points)
  • Output results of Hive query into CSV file

    We're trying to put the results of a Hive query into a CSV file. Here's the command we came up with: insert overwrite directory '/home/output.csv' select books from table; So, when it's done, it says completed but we can't find the file. Where is it? Or should we extract it in a different way?

    ITKE (376,350 points)
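Some background on the Hive question above: `insert overwrite directory` writes a directory (on HDFS unless the `local` keyword is used) containing one or more plain-text part files, not a single CSV file, and by default the columns are separated by the Ctrl-A character (`\x01`). The Python sketch below converts such lines to CSV. The helper name `hive_text_to_csv` and the sample rows are made up for illustration, and it assumes the default delimiter was not overridden in the query.

```python
import csv
import io

def hive_text_to_csv(lines, out, delimiter="\x01"):
    """Rewrite Hive's default Ctrl-A delimited text as CSV; return the row count."""
    writer = csv.writer(out)  # quotes fields that themselves contain commas
    count = 0
    for line in lines:
        writer.writerow(line.rstrip("\n").split(delimiter))
        count += 1
    return count

# Made-up sample rows standing in for the contents of a part file.
sample = ["Dune\x011965\n", "Emma, a novel\x011815\n"]
buffer = io.StringIO()
rows_written = hive_text_to_csv(sample, buffer)
csv_text = buffer.getvalue()
```

With real output, `lines` would be a part file opened for reading after copying it out of HDFS, and `out` a file opened with `newline=""` as the csv module documentation recommends.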
  • How to parse big data JSON file

    I have a JSON file that is roughly 36 GB and I need to access it more efficiently. I've been using RapidJSON's SAX-style API in C++ but it takes about two hours to parse. Now here's my question: Should I split the big file into millions of small files? Is there any other approach I should take?...

    ITKE (376,350 points)
  • Speed up data processing in MongoDB

    I've been using MongoDB to get every document in a collection. It's working, but with so many small documents (there are over 100 million), it's very slow. Here's what I'm using: count <- mongo.count(mongo, ns, query) cursor <- mongo.find(mongo, query) name <- vector("character", count)...

    ITKE (376,350 points)
  • Out of memory error when installing Hadoop

    I recently tried to install Hadoop following a document my friend gave me. When I tried to execute this: bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+' I got this exception: java.lang.OutOfMemoryError: Java heap space Has anyone seen this before? Like I said, I'm pretty new to...

    ITKE (376,350 points)
  • JAVA_HOME is not set correctly when installing Hadoop on Ubuntu

    I've been trying to install Hadoop on Ubuntu 11.10. I just set the JAVA_HOME variable in the file conf/hadoop-env.sh to: # export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk Then I tried to execute these commands: $ mkdir input $ cp conf/*.xml input $ bin/hadoop jar hadoop-examples-*.jar grep input...

    ITKE (376,350 points)
  • Available material to help teach Data Management

    I work at Houston Community College and am scheduled to teach a Data Management course this summer. Is there any available material that I can use as a resource? I would be very appreciative.

    enriquej (5 points)
  • Case-insensitive query in MongoDB

    Does anyone know if it's possible to make a case-insensitive query in MongoDB? Something like this: > db.stuff.save({"foo":"bar"}); > db.stuff.find({"foo":"bar"}).count(); 1 > db.stuff.find({"foo":"BAR"}).count(); 0 Thanks!

    ITKE (376,350 points)
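One commonly used approach to the case-insensitive query above is an anchored regular expression with the `i` option, for example `db.stuff.find({"foo": {"$regex": "^bar$", "$options": "i"}})` in the shell. The Python sketch below builds that filter document and checks it against an in-memory list, so it runs without a MongoDB server. The helpers `case_insensitive_filter` and `matches` are hypothetical, and `matches` is only a rough stand-in for the server-side behaviour, using Python's regex engine rather than MongoDB's.

```python
import re

def case_insensitive_filter(field, value):
    """Build a MongoDB-style filter matching `value` exactly, ignoring case."""
    # re.escape keeps characters such as '.' or '*' in the value literal.
    pattern = "^" + re.escape(value) + "$"
    return {field: {"$regex": pattern, "$options": "i"}}

def matches(document, query):
    """In-memory approximation of the match, for illustration only."""
    for field, condition in query.items():
        flags = re.IGNORECASE if "i" in condition.get("$options", "") else 0
        text = str(document.get(field, ""))
        if re.search(condition["$regex"], text, flags) is None:
            return False
    return True

stuff = [{"foo": "bar"}, {"foo": "Bar"}, {"foo": "barn"}]
query = case_insensitive_filter("foo", "BAR")
hits = [doc for doc in stuff if matches(doc, query)]
```

With a driver such as pymongo, the same `query` dictionary could be passed to a collection's `find` method. The anchors `^` and `$` matter here: without them, "barn" would also match.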
  • Hadoop: What’s the difference between Pig and Hive?

    I'm pretty new to the Hadoop world (been using it for about a month) and I've started to get into Hive, Pig and Hadoop using Cloudera's Hadoop VM. Is there a difference between Pig and Hive? I understand they have similar commands so I'm trying to figure out the big differences.

    ITKE (376,350 points)
