I am working on a research project in large-scale data mining. I have written code that organizes the data in a dictionary, but the amount of data is so large that my computer runs out of memory while building it. From time to time I have to write the in-memory dictionary out to disk and start a fresh one, ending up with many dictionaries this way. I then need to compare the resulting dictionaries, update keys and values accordingly, and store the whole thing in one big dictionary on disk. Any ideas how I can do this in Python? I need an API that can quickly write a dict to disk, and then compare two dicts and update keys. I can actually write the code to compare two dicts myself; that is not the problem. The problem is running out of memory.
My dict looks like this: "Orange": ["This is a fruit", "It is very tasty", ...]
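(Not part of the original question, but for context: the standard library's shelve module is one ready-made way to get the dict-on-disk behavior described above. A minimal sketch, with made-up file and key names:)

    import shelve

    # Open (or create) a disk-backed, dict-like store; the whole
    # dataset never has to fit in memory at once.
    with shelve.open("descriptions.db") as db:
        db["Orange"] = ["This is a fruit", "It is very tasty"]

        # Merge a second, smaller in-memory dict into the one on disk,
        # extending the value lists of keys that already exist.
        new_items = {"Orange": ["It is orange"], "Apple": ["This is a fruit"]}
        for key, values in new_items.items():
            existing = db.get(key, [])
            existing.extend(values)
            db[key] = existing  # reassign so shelve writes it back to disk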
I agree with Hoffman: get a database. Data processing like this is an unusual task for relational engines, but I believe it is a good compromise between ease of use/deployment and speed for large datasets.
I usually use sqlite3, which comes bundled with Python. The advantage of a relational engine such as sqlite3 is that it sits there with your data, can be driven through many processing steps and updates, and it will take care of swapping all the needed data between memory and disk in a very sensible manner. You can also use in-memory databases to hold the small data you need to interact with your large data, and link them through the ATTACH statement. I have processed gigabytes like this.
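(A minimal sketch of this approach, assuming one row per (key, value) entry; the file, table, and column names are illustrative, not from the answer:)

    import sqlite3

    # The big dictionary lives on disk; sqlite3 pages data in and
    # out of memory as needed.
    conn = sqlite3.connect("big_dict.db")
    conn.execute("CREATE TABLE IF NOT EXISTS entries (key TEXT, value TEXT)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_key ON entries (key)")

    # Small, transient data goes in an in-memory database, linked to
    # the on-disk one through the ATTACH statement.
    conn.execute("ATTACH DATABASE ':memory:' AS mem")
    conn.execute("CREATE TABLE mem.entries (key TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO mem.entries VALUES (?, ?)",
        [("Orange", "This is a fruit"), ("Orange", "It is very tasty")],
    )

    # Merging the small dict into the big one is a single SQL statement;
    # the engine decides what stays in memory and what spills to disk.
    conn.execute("INSERT INTO entries SELECT key, value FROM mem.entries")
    conn.commit()

    # Reading one key's value list back:
    values = [row[0] for row in
              conn.execute("SELECT value FROM entries WHERE key = ?", ("Orange",))]
    print(values)  # ['This is a fruit', 'It is very tasty']
    conn.close()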