The PyData conference and workshop is a semi-annual event for scientists, engineers, and data analysts in the Python community. The conference focuses on techniques and tools for management, analytics, and visualization of data of different types and sizes with particular emphasis on big data.
In this video tutorial from the 2012 PyData Workshop, John Hunter, author of matplotlib, is going to give you some advanced insight into the plotting library.
Coming from the 2012 PyData Workshop, Wes McKinney, CTO and cofounder of Lambda Foundry, gives us a tour of Pandas, a rich data manipulation tool built on top of NumPy. Frustrated with working in R, Wes started building Pandas in 2008 with a focus on fast, intuitive data structures and data manipulation capabilities. The Pandas project has seen huge growth in the last few years, and aims to be the ultimate data tool for Python.
In this video tutorial from the PyData Workshop, scikit-learn contributor Jacob VanderPlas is going to give you an overview of machine learning in Python using scikit-learn.
He'll talk about general machine learning concepts, as well as walk you through a few exercises that demonstrate how you can use machine learning technology.
In this video from the 2012 PyData Workshop, Stéfan van der Walt is going to give you an in-depth look at scikits-image, the image processing toolbox for Python.
Stéfan will talk about the wide range of applications for image processing in industry, and demonstrate how you can use scikits-image to solve specific industry problems.
Travis Oliphant, CEO of Continuum Analytics, kicks off the PyData Workshop with a talk on Python in Big Data. Topics addressed include what Python has to offer the world of Big Data, specific use cases, and why Hadoop is considered the de facto standard.
Additionally, Travis gives an overview of NumPy and SciPy.
Chris Mueller from Life Technologies introduces us to Disco, a MapReduce framework built in Python and Erlang.
Showing that Hadoop is not alone in the MapReduce world, Chris reviews the basic MapReduce paradigm, dataflow, and file and job distribution, and goes on to explain the Disco Distributed Filesystem (DDFS) before going into some use-case scenarios in next-generation genomic sequencing.
In this presentation from the 2012 PyData Workshop hosted at Google on March 2-3, Guido van Rossum, author of the Python programming language, engages in an open discussion on the intersection of the evolution of Python and the growth of the scientific community. Panelists include Fernando Perez, Travis Oliphant, and David Cournapeau.