Data Management Tag Directory

Featured Data Management Questions

  • Comparison between waterfall model and evolutionary model

    What are the advantages and disadvantages of the waterfall model over the evolutionary model in a software development life cycle?

    Coolsnipster0 points
  • Career path to becoming a big data analyst

    Hello everyone. By way of introduction, I am Sean, and I am trying to work out which path to take to become an analytics professional in the big data field. I don't come from an IT background, but I have a deep interest in technical know-how. I have done Business...

    kirktt2005 points
  • Merge two Physical Files using Logical File

    I have multiple physical files (Trans2006, Trans2007, Trans2008, etc.) with the same record layout, holding the yearly transactions on my iSeries. I want to merge these files (for a temporary period). Is it possible to create a logical file over them? If so, please help me.

    nhewage50 points
  • MongoDB Java Connection driver

    I'm pretty new to MongoDB and I'm trying to figure out the code below from its Java connection driver. Why does it use a random number generator?

        if (!((_ok) ? true : (Math.random() > 0.1))) { return res; }

    Thanks!

    ITKE366,230 points
  • Computer to create a complete sequence of 2^75,000,000

    How long would it take a computer to be able to create a complete sequence of 2^75,000,000?

    rarkyan25 points
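
    For scale, the count above is astronomically large; a rough back-of-the-envelope estimate in Python, assuming a generous rate of one billion sequences per second (the rate is an assumption, not a measurement):

        import math

        RATE = 10**9              # assumed: one billion sequences per second
        exponent = 75_000_000     # 2**75,000,000 sequences in total

        # Work in logarithms; the count itself has roughly 22.6 million decimal digits.
        log10_count = exponent * math.log10(2)
        log10_seconds = log10_count - math.log10(RATE)
        log10_years = log10_seconds - math.log10(60 * 60 * 24 * 365)

        print(f"2^{exponent} has about {log10_count:,.0f} decimal digits")
        print(f"at {RATE:,} sequences per second, that is about 10^{log10_years:,.0f} years")
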
  • How should I start my data analysis?

    I have a database which has machine data (tables such as machine cycles, machine phases, and many more) and I am supposed to analyze this data. Meaning, with the data present in different tables, I have to filter out, or rather define, standard machine cycles, standard machine phases, etc. Hence, can you...

    rakeshmurthy5 points
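
    One hedged starting point for the analysis described above, assuming the tables can be pulled into pandas; the connection string, table names, and column names (machine_cycles, machine_phases, cycle_id, duration_s) are hypothetical placeholders:

        import pandas as pd
        from sqlalchemy import create_engine

        # Hypothetical connection string and schema; adjust to the real database.
        engine = create_engine("postgresql://user:pass@host/machines")
        cycles = pd.read_sql("SELECT * FROM machine_cycles", engine)
        phases = pd.read_sql("SELECT * FROM machine_phases", engine)

        # Join phases onto their parent cycle and summarise per cycle
        # (duration_s is assumed to be a phase-level column).
        df = phases.merge(cycles, on="cycle_id", suffixes=("_phase", "_cycle"))
        summary = df.groupby("cycle_id")["duration_s"].agg(["count", "mean", "std"])

        # A first cut at a "standard" cycle: cycles whose total duration lies
        # within two standard deviations of the mean total duration.
        totals = df.groupby("cycle_id")["duration_s"].sum()
        standard = totals[(totals - totals.mean()).abs() <= 2 * totals.std()]
        print(summary.head(), len(standard))
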
  • How to open the .FCA file format data?

    Dear Sir/Madam, I have a file in the .fca format. I want to read, edit, print, etc. the .fca data. What should I do in this regard?

    herambgaikwad0125 points
  • Error when configuring Hadoop on CentOS

    I've been configuring Hadoop on one of our servers running CentOS. When I run start-dfs.sh, I keep getting this warning:

        WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

    So I'm not sure what to do here. Has anyone...

    ITKE366,230 points
  • ElasticSearch for real-time statistics

    We have to log millions and millions of small log documents on a weekly basis. This includes: ad hoc queries for data mining; joining and filtering values; and full-text search with Python. We thought about using HBase and running Hadoop jobs to generate stat results. But the problem is that the results...

    ITKE366,230 points
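
    If Elasticsearch ends up being the choice for the workload above, here is a minimal sketch using the official Python client (elasticsearch-py, 8.x API assumed); the index name, field names, and aggregation are hypothetical:

        from datetime import datetime, timezone
        from elasticsearch import Elasticsearch

        es = Elasticsearch("http://localhost:9200")   # assumed local cluster

        # Index one small log document (field names are hypothetical).
        es.index(index="logs-weekly", document={
            "message": "disk latency spike on node-7",
            "level": "WARN",
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
        es.indices.refresh(index="logs-weekly")       # make the document searchable immediately

        # Full-text search plus a terms aggregation for ad hoc statistics.
        resp = es.search(
            index="logs-weekly",
            query={"match": {"message": "latency"}},
            aggs={"by_level": {"terms": {"field": "level.keyword"}}},
        )
        print(resp["hits"]["total"], resp["aggregations"]["by_level"]["buckets"])
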
  • What database program should we use for our big data?

    We have a bunch of text files, which range between 300 and 400 GB. Here's the format:

        key1 value_a
        key1 value_b
        key1 value_c
        key2 value_d
        key3 value_e
        ...

    So each line is composed of a key and a value. I'm trying to create a database that would let us query all values of a key. But we're having issues...

    ITKE366,230 points
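
    One hedged sketch of the "query all values of a key" requirement above, using SQLite from Python (disk-backed, with an index on the key); the file names are hypothetical, and at 300-400 GB it is worth testing whether a single SQLite file holds up before committing to it:

        import sqlite3

        conn = sqlite3.connect("kv.db")     # hypothetical database file
        conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT, value TEXT)")
        conn.execute("CREATE INDEX IF NOT EXISTS idx_key ON kv (key)")

        def load(path):
            """Bulk-load one 'key value'-per-line text file into the table."""
            with open(path) as f, conn:
                conn.executemany(
                    "INSERT INTO kv (key, value) VALUES (?, ?)",
                    (line.rstrip("\n").split(" ", 1) for line in f if " " in line),
                )

        def values_for(key):
            """Return every value stored under the given key."""
            return [v for (v,) in conn.execute("SELECT value FROM kv WHERE key = ?", (key,))]

        load("part-000.txt")                # hypothetical input file
        print(values_for("key1"))
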
  • How to implement data analytics in mining industry?

    What are the value-chain gaps where data analytics can be implemented in the mining industry? How can it be used to its full potential?

    Samayn5 points
  • Working with big data in Python and low RAM

    We have to implement algorithms for 1000-dimensional data with 200k+ data points in Python. I need to perform different operations (clustering, pairwise distances, etc.). When I try to scale all of the algorithms, I run out of RAM. But here's the thing: I need to do this with several computers with...

    ITKE366,230 points
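
    For the memory side of the problem above, a hedged single-machine sketch using a memory-mapped array plus scikit-learn's chunked and mini-batch tools; the file name, dtype, and shape are assumptions:

        import numpy as np
        from sklearn.cluster import MiniBatchKMeans
        from sklearn.metrics import pairwise_distances_chunked

        # Memory-map the data instead of loading it: 200k x 1000 float32 values are
        # about 800 MB on disk, but only the chunks actually touched enter RAM.
        X = np.memmap("points.f32", dtype=np.float32, mode="r", shape=(200_000, 1000))

        # Clustering without holding everything in memory: fit on mini-batches.
        km = MiniBatchKMeans(n_clusters=50, batch_size=10_000, random_state=0)
        for start in range(0, X.shape[0], 10_000):
            km.partial_fit(X[start:start + 10_000])

        # Pairwise distances in bounded-memory chunks (here: each point's nearest neighbour).
        nearest = np.empty(X.shape[0])
        offset = 0
        for chunk in pairwise_distances_chunked(X, working_memory=256):   # MiB per chunk
            np.fill_diagonal(chunk[:, offset:offset + chunk.shape[0]], np.inf)
            nearest[offset:offset + chunk.shape[0]] = chunk.min(axis=1)
            offset += chunk.shape[0]
        print(km.cluster_centers_.shape, nearest[:5])
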
  • How to sort 500 GB text file in Linux

    We have a 500 GB text file with roughly 10 billion rows that needs to be sorted in alphabetical order. What's the best way to do this? An algorithm? For now, we've been using this command:

        LANG=C sort -k2,2 --field-separator=',' --buffer-size=(80% RAM) --temporary-directory=/volatile BigFile

    And...

    ITKE366,230 points
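
    The usual algorithm here is an external merge sort, which GNU sort already implements; for reference, a minimal Python sketch of the same idea (chunk, sort in memory, k-way merge), with the chunk size and file names as assumptions. Note that it orders whole lines, whereas the command above keys on the second comma-separated field:

        import heapq
        import tempfile
        from pathlib import Path

        CHUNK_BYTES = 500 * 1024 * 1024      # assumed chunk size; tune to available RAM

        def external_sort(src="BigFile", dst="BigFile.sorted", tmp_dir="/volatile"):
            chunks = []
            with open(src) as f:
                while True:
                    lines = f.readlines(CHUNK_BYTES)   # roughly one chunk of complete lines
                    if not lines:
                        break
                    lines.sort()                       # in-memory sort of this chunk
                    tmp = tempfile.NamedTemporaryFile("w+", delete=False, dir=tmp_dir)
                    tmp.writelines(lines)
                    tmp.seek(0)
                    chunks.append(tmp)
            # k-way merge of the sorted chunk files into the final output.
            with open(dst, "w") as out:
                out.writelines(heapq.merge(*chunks))
            for tmp in chunks:
                tmp.close()
                Path(tmp.name).unlink()

        external_sort()
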
  • Database that’s not limited by RAM size

    For a project we're about to start, we're looking for a database with these properties: non-persistent; keys need to be updated once every 3-6 hours; quick selection of data by key; a DBMS; not in-memory; Java support. We were looking at MongoDB, but it has high fragmentation costs. Redis looks good, but our data...

    ITKE366,230 points
  • Hadoop error when accessing application through HTML

    We have installed Hadoop on our cluster and now we're installing HttpFS to access the HDFS content over HTTP. We're able to access the normal page, but when we try to access HDFS, we get this error:

        {"RemoteException":{"message":"User: ubantu is not allowed to impersonate ubantu",...

    ITKE366,230 points
  • What’s faster for big data processing: MongoDB or Redis

    I'm currently working on my big data project and I'm trying to decide between Redis and MongoDB. Which one would be faster for processing (from a performance standpoint)? I would appreciate any advice.

    ITKE366,230 points
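
    Raw speed depends heavily on the access pattern, so one hedged way to answer the question above is a small benchmark against both stores; a sketch using pymongo and redis-py, with localhost defaults and a trivial write/read workload as assumptions:

        import time
        import redis
        from pymongo import MongoClient

        N = 10_000  # assumed small workload; replace with your real access pattern

        def bench(label, write, read):
            start = time.perf_counter()
            for i in range(N):
                write(i)
            for i in range(N):
                read(i)
            print(f"{label}: {time.perf_counter() - start:.2f}s for {N} writes + {N} reads")

        # MongoDB (localhost defaults assumed)
        coll = MongoClient("mongodb://localhost:27017")["bench"]["items"]
        coll.drop()
        bench("mongodb",
              lambda i: coll.insert_one({"_id": i, "value": f"v{i}"}),
              lambda i: coll.find_one({"_id": i}))

        # Redis (localhost defaults assumed)
        r = redis.Redis(host="localhost", port=6379, db=0)
        r.flushdb()
        bench("redis",
              lambda i: r.set(f"item:{i}", f"v{i}"),
              lambda i: r.get(f"item:{i}"))
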
  • Output results of Hive query into CSV file

    We're trying to put the results of a Hive query into a CSV file. Here's the command we came up with:

        insert overwrite directory '/home/output.csv' select books from table;

    So, when it's done, it says completed, but we can't find the file. Where is it? Or should we extract it in a different way?

    ITKE366,230 points
  • How to parse big data JSON file

    I have a JSON file that is roughly 36 GB and I need to access it more efficiently. I've been using rapidjson's SAX-style API in C++, but it takes about two hours to parse. Now here's my question: should I split the big file into millions of small files? Is there any other approach I should take?...

    ITKE366,230 points
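
    If switching tools is an option, one common alternative to splitting the file above is to stream the parse once and re-emit newline-delimited JSON, which can then be split, indexed, or processed in parallel without re-parsing everything; a hedged Python sketch using the third-party ijson package, assuming the file is one large top-level JSON array and with hypothetical file names:

        import json
        import ijson   # third-party streaming (SAX-like) JSON parser

        with open("big.json", "rb") as src, open("big.ndjson", "w") as dst:
            # "item" selects each element of a top-level JSON array, one at a time,
            # so the 36 GB document is never held in memory all at once.
            for record in ijson.items(src, "item"):
                # ijson may yield Decimal for numbers, hence the default=float.
                dst.write(json.dumps(record, default=float) + "\n")
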
  • Speed up data processing in MongoDB

    I've been using MongoDB to get every document in a collection. It's working, but with so many small documents (there are over 100 million), it's very slow. Here's what I'm using:

        count <- mongo.count(mongo, ns, query)
        cursor <- mongo.find(mongo, query)
        name <- vector("character", count)
        ...

    ITKE366,230 points
  • Out of memory error when installing Hadoop

    I recently tried to install Hadoop following a document my friend gave me. When I tried to execute this:

        bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'

    I got this exception:

        java.lang.OutOfMemoryError: Java heap space

    Has anyone seen this before? Like I said, I'm pretty new to...

    ITKE366,230 points
