Tag Archives: pandas

Discovering Pandas (Python)

So I’ve added another book to my Python library: “Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython”.

So far, it promises to be a good introduction to data science using Python tools. The first thing you need to do to follow along with the exercises in the book is get pandas installed, which is easy. The link is right here.

Pandas leverages NumPy, which I was familiar with only in that I had heard of it. But I didn’t realize the significance of NumPy in terms of its user community or the size of the package. In fact, I assumed it might ship as part of the standard library. Wrong!

But you don’t have to download and install pandas or NumPy by hand. I’ve just discovered pip, which I had overlooked until now, likely because I spend most of my Python time in the standard library. Once you have pip running with your Python installation of preference, getting new software is as easy as running “sudo apt-get install” in Linux.

You just type “pip install pandas” and let it download the necessary packages from the web and install them locally. If you’ve already downloaded a Python wheel (WHL), you can also point pip at that and install from the local file. For more about Wheels, which are replacing Eggs, go here.
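For anyone who would rather drive pip from a script than from the command prompt, here is a minimal sketch of the same two installs (by package name, and from a local wheel). It is not from the book. The function names are my own, and the wheel file name is a made-up placeholder; it assumes pip is available to the interpreter that runs the script.

```python
import subprocess
import sys


def pip_install_command(target):
    """Build the pip command for a package name or a path to a local .whl file."""
    # Running pip as "python -m pip" ties the install to this interpreter,
    # which helps when several Python installations share one machine.
    return [sys.executable, "-m", "pip", "install", target]


def pip_install(target):
    """Run pip and raise CalledProcessError if the install fails."""
    subprocess.check_call(pip_install_command(target))


if __name__ == "__main__":
    # Only print the commands here; call pip_install() to actually install.
    print(pip_install_command("pandas"))
    # Hypothetical wheel file name, for illustration only.
    print(pip_install_command("example_package-1.0-py3-none-any.whl"))
```

Calling `pip_install("pandas")` should do the same thing as typing “pip install pandas”, just tied to whichever Python is running the script.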

When I tried to install pandas, though, I at first got an error related to “Windows C++ 10.0” (link). That turned out to be due to my trying to install 32-bit NumPy. So I ended up downloading a 64-bit version for my 64-bit Python 3.3 and then used pip to install from that WHL. The whole installation took no more than 10 minutes. Now I’m ready to play with pandas.
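Since the 32-bit versus 64-bit mismatch is what tripped me up, a quick check of the interpreter before downloading a wheel seems worthwhile. The following is a rough sketch with helper names of my own choosing. It assumes Python 3.4 or later, because `importlib.util.find_spec` is not available in 3.3.

```python
import importlib.util
import struct
import sys


def python_bitness():
    """Return 32 or 64, based on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


def is_installed(package_name):
    """Return True if a top-level package can be found, without importing it."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    major, minor = sys.version_info[0], sys.version_info[1]
    print("Python %d.%d, %d-bit" % (major, minor, python_bitness()))
    for name in ("numpy", "pandas"):
        print(name, "installed" if is_installed(name) else "missing")
```

If the script reports a 64-bit interpreter, the wheel you download should be the 64-bit build as well.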

Python & Big Data Analytics (some good links)

I am reading a lot about the use of Python in the data management arena, and while I am not currently working with Big Data, I thought this article, Using Python for Big Data Analytics, had some great information on things to avoid in Python.

Here is an article that lists Python among the best languages for crunching data (in other words, “Big Data”). It also mentions a number of other languages I am not familiar with at all. Kafka? I must be living on the dark side of the moon.

Finally, to complete the triad of links for sharing, there is a page with some pandas how-tos. Pandas is another framework I need to take a look at. It seems to pop up in data analytics everywhere. In fact, O’Reilly has a number of titles on Python in that sphere, including one that touches on pandas: Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. I should pick up a copy quick… Data science sounds like job security.
