Data Frames

The DataFrame.py class, posted by Andrew Straw on the scipy-user mailing list (original link), is an extremely useful tool for working with alphanumerical tabular data of the kind often found in databases. Data that might be ingested into a data frame could look like this:

ID	LOCATION	VAL_1	VAL_2
01	Somewhere	0.1	0.6
02	Somewhere Else	0.2	0.5
03	Elsewhere	0.3	0.4

The DataFrame.py class can be populated from a CSV (comma-separated values) file. In its current implementation, these files are read with Python's own csv module, which allows for a great deal of customisation.
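As a rough illustration of the kind of customisation the csv module offers, the sketch below defines and registers a custom dialect and then reads a small in-memory table with it. The dialect name `semicolon_quoted` and its settings are assumptions made up for this example; they are not the dialect shipped with DataFrame.py.

```python
import csv
import io


# Hypothetical dialect for illustration only; not part of DataFrame.py.
class SemicolonQuoted(csv.Dialect):
    delimiter = ";"
    quotechar = '"'
    doublequote = True
    skipinitialspace = False
    lineterminator = "\r\n"          # only used when writing
    quoting = csv.QUOTE_NONNUMERIC   # unquoted fields are read as floats


csv.register_dialect("semicolon_quoted", SemicolonQuoted)

# In-memory stand-in for a file on disk, using the sample data from above.
sample = io.StringIO(
    '"ID";"LOCATION";"VAL_1";"VAL_2"\n'
    '"01";"Somewhere";0.1;0.6\n'
    '"02";"Somewhere Else";0.2;0.5\n'
    '"03";"Elsewhere";0.3;0.4\n'
)

rows = list(csv.reader(sample, dialect="semicolon_quoted"))
header, records = rows[0], rows[1:]
```

Because the dialect uses `csv.QUOTE_NONNUMERIC`, quoted fields such as the ID stay strings while the unquoted value columns come back as floats.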

Example Usage

A sample CSV file exported from Access2000 is in CSVSample.csv. We first import the module:

    import DataFrame

and read the file in using our desired CSV dialect:

    df = DataFrame.read_csv("CSVSample.csv", dialect=DataFrame.access2000)

(note that the dialect is actually defined in the DataFrame class). It is often useful to filter the data according to some criterion.
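This page does not show the filtering interface of DataFrame.py itself, so the following is only a minimal sketch of the idea using the standard csv module. The threshold on `VAL_1` and the variable names are assumptions chosen for illustration, and the data is the sample table from the top of this page.

```python
import csv
import io

# In-memory stand-in for a CSV file containing the sample table.
sample = io.StringIO(
    "ID,LOCATION,VAL_1,VAL_2\n"
    "01,Somewhere,0.1,0.6\n"
    "02,Somewhere Else,0.2,0.5\n"
    "03,Elsewhere,0.3,0.4\n"
)

# Each row becomes a dict keyed by the column names in the header line.
records = list(csv.DictReader(sample))

# Criterion (assumed for this example): keep rows whose VAL_1 is at least 0.2.
selected = [row for row in records if float(row["VAL_1"]) >= 0.2]
selected_ids = [row["ID"] for row in selected]
```

Here every field is read as a string, so the numeric comparison needs an explicit `float()` conversion.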

Compatibility with Python 2.6 and above

Starting with Python 2.6, the sets module is deprecated. To get rid of the deprecation warning, replace

    import sets

with

    try:
        set
    except NameError:
        from sets import Set as set

Then replace all instances of sets.Set() with set().
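To show what the replacement looks like in practice, here is a small sketch that applies the fallback and then uses `set()` where older code would have called `sets.Set()`. The location names are just the sample values from this page, and the variable names are made up for the example.

```python
# Use the built-in set where it exists; fall back to sets.Set on very old Pythons.
try:
    set
except NameError:
    from sets import Set as set

locations = ["Somewhere", "Somewhere Else", "Somewhere", "Elsewhere"]

# Previously: sets.Set(locations)
unique_locations = set(locations)

# Set operators such as intersection behave the same with either implementation.
common = unique_locations & set(["Elsewhere", "Nowhere"])
```

On any interpreter that has the built-in `set` type, the `except` branch is never taken, so the deprecated module is not imported at all.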

CategoryCookbook