PyCon AU 2013
Andrew Rowe will present the lessons learnt and techniques used to process very large amounts of data from the ABS Census. The Australian Bureau of Statistics used Python to investigate data from the 2006 Australian Census. Python is an integral part of ABS systems to determine duplicated entries and link people in the Census to other ABS collections. You will learn about:
- Handling large data
- Dealing with confidentiality
- Multiprocessing techniques
- Performance tips and tricks
- Difference between if( 1 < 2 ) and if 1 < 2
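The abstract does not say what the talk concludes about the last topic it lists. One way to look into it yourself, assuming the question is whether the parentheses change what Python actually parses, is to compare the syntax trees of the two spellings with the standard library's ast module. The helper name same_syntax_tree is made up for this sketch, and it says nothing about any timing measurements the talk may present.

```python
import ast


def same_syntax_tree(source_a, source_b):
    """Return True if two snippets parse to the same abstract syntax tree."""
    # ast.dump() leaves out line and column positions by default, so only
    # the structure of the code is compared, not its layout.
    return ast.dump(ast.parse(source_a)) == ast.dump(ast.parse(source_b))


with_parens = "if( 1 < 2 ):\n    pass\n"
without_parens = "if 1 < 2:\n    pass\n"

print(same_syntax_tree(with_parens, without_parens))
```

Parentheses used only for grouping are not recorded in the syntax tree, so under that assumption the two forms look the same to the compiler.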
A Continuous Integration (CI) server is an essential tool for any developer, but with so many different servers out there it's hard to choose which one to use. Buildbot has a pretty steep learning curve, but rewards with a very light footprint and amazing flexibility and configurability. In this talk I will walk through the build(ing) blocks and concepts required to put together a simple CI server based upon buildbot, and also suggest some more advanced features.
Reporting and analysis systems rely on coherent and reliable data, often from disparate sources. To that end, a series of well-established data warehousing practices have emerged to extract data and produce a consistent data store.
This talk will look at some options for composing workflows using Python. In particular, we'll explore beyond Celery's asynchronous task processing functionality into its workflow (aka Canvas) system and how it can be used in conjunction with SQLAlchemy's architecture to provide the building blocks for data stream processing.
Cloud computing is changing the way that businesses think about their computing requirements. Instead of ordering hardware, waiting for delivery, allocating space in a data center, installing and wiring it up, and then configuring each piece of the system, you can now do the equivalent with a few clicks on a control panel, but that gets tedious. What's much more interesting is to do all of this programmatically, using our favorite language: Python!
This session will deep-dive into this topic by using pyrax, the Python SDK for the OpenStack and Rackspace Clouds. It will cover the following:
• Getting your cloud account and credentials
• Installing pyrax
• Creating servers
• Saving a customized server as a template image
• Creating more servers on demand from your template images
• Creating, attaching, and imaging Block Storage volumes
• Using private networks to create a bastion host setup
• Managing these servers with a Load Balancer
• Creating and managing Cloud Databases
• Using pyrax to manage your DNS
• Object storage and management using pyrax
The IPython Notebook is a powerful web app for exploring ideas and data sets with Python. It has excellent integration with Matplotlib, giving the user highly customisable static plots with ease. But for larger data sets, a static plot may not be ideal - the ability to pan, zoom, choose dynamic layers and sample the data at particular points would be nice. This talk will demonstrate just how easy it is to integrate a Web Map Service/client such as Pydap/Leaflet.js into the IPython Notebook.
Stack Overflow is the single greatest repository of coding knowledge in the world. Now approaching five years old, its community-moderated, strict Q&A format has made it far more useful than other similar sites.
Contributing to it, however, can be intimidating. The questions of new users are often voted down or closed with little comment, or edited by the community in ways the original poster doesn't understand. Answering is even worse. How do other people manage to post a detailed answer to a specific question in minutes, or even seconds? How could I possibly know so much about such a broad range of topics, even within a single programming language or framework?
Over the past three years, I've learned a lot from contributing to Stack Overflow. It has honed my research, technical writing, and rapid prototyping skills, as well as greatly expanded my knowledge of the Python ecosystem, standard library, and CPython internals.
I'll share not only how to use the site to learn, but also how to compete effectively with the thousands of other programmers who answer questions there on a daily basis -- we all like to win. I'll talk about what goes into a good answer, as well as a good question. I'll also talk about how contributing to Stack Overflow is like contributing to any other open source project in many ways -- in what you gain, as well as what the community gains, partly because of the CC-by-SA licensing used by the Stack Exchange network of sites.
This talk will be an in-depth review of the stages that most open source projects go through, and the decisions their maintainers face. Requests will be used as an example; lessons learned and best practices will be covered.
Have you ever wanted to write a game? This talk will give you all the stuff that you will need to make a game using the pygame game making library. Come to this talk so you don't make all the mistakes that I did. This talk will cover all kinds of awesome things that your game will need like images, music, sound effects, drawing, and all the vital things you will need to know. And I will show you how to do it right!
We spent the first part of our careers developing products and managing software teams at IBM, Dell, and Texas Instruments. After moving to Singapore, we each found our way into the education space and found ourselves teaching Python to a variety of audiences. We developed our first Python game for a friend of ours who was teaching computer science in a local high school. Since then, it has been a continual journey to see how fun we could make learning Python before our now 10-year-old daughter was ready to pick up the keyboard and start helping out. In this talk, we will talk about where our journey started, where we are today, and where we hope to be in the near future. We will discuss the Tournament-based Teaching methodology that we developed to increase student preparation, in-class engagement, and peer-based learning. Then we will talk about our research into using mobile devices such as the iPad to help students practice reading and working with Python to gain confidence before moving on to writing code. We will discuss our latest family project where we have worked with our children to create a fun Python quest to encourage learners around the world to practice their Python for just a few minutes more each week. Then looking forward, we will talk about our latest research into pair-based programming assignments, pair-programming tournaments, team-based logic competitions, encouraging more girls to try programming, and the goal to frame programming as a very healthy habit. Our hope is that this talk will be interesting for educators looking to increase student engagement and interest, for managers looking for fun ways to increase the productivity of their development teams, and for parents looking for additional ways to enable their children to experiment with Python.
Writing scientific software in support of experimentation and simulation is a challenging task. It is even more challenging in cases where such software must be distributed across multiple machines. Existing methods for addressing this problem can require significant effort to maintain and extend. Alternative approaches such as message queues can be incredibly difficult for novices to install.
This presentation will demonstrate a quick and easy approach to solving this problem using the redis-queue module. This approach makes it easy to make efficient use of multiple cores and multiple machines, with only minimal dependence on external packages.
This is a tutorial on using the latest and most exciting tools in Python for scientific and engineering applications in 2013, with a focus on 'big data' applications. Using real-world data sets and a fully Python 3 environment, it will walk you through what's possible with modern tools like the machine-learning package scikit-learn, the image-processing package scikit-image, the Pandas toolkit for data analysis, and IPython-parallel. It will also review the upcoming generation of tools like Numba and Blaze.
Indie game developer Luke Miller presents a brief overview on making point-and-click adventure games using the open source pyvida gaming engine and uses his commercially released gay-themed adventure game "My Ex-Boyfriend the Space Tyrant!" as a tutorial on developing, packaging, releasing and selling a python game for Windows, Mac and Linux.
Have you ever found yourself obsessively checking the UPS or FedEx tracking site to see if your package finally got delivered at your doorstep? Or wondered when your contractor/gardener showed up to do their job? Or if your neighbor came looking for you on an urgent matter while you were out?
In this talk, I will show you how you can relax and rely on your handy-dandy smartphone to let you know when these events happen along with video snippets of what happened and who showed up!
Brett covers the things that excite him about each of Python, Ruby and Go. He covers some of the cool stuff he's seen lately, some of the lessons learned from different ecosystems, and more specifically where he thinks Python could improve, or what it excels at.
By using a variety of techniques and technologies, you can tap into the expert knowledge of others more effectively. Revision control and code reviews are great for software quality, but not everyone is going to work that way. Tools such as ipython notebook, sharing gists, demonstration sessions and screencasts are a great way to get others involved in problem solving. Knowing how to use these tools quickly and effectively can also be a great way to explain problems to management, or to walk them through a complex requirement.
For when things should be pluggable... for when you want users to be able to extend your app... what should you do? "I will write a framework!", I hear you say. No need to invent your own pluggable system - use Component Architecture! The framework for frameworks (tested on animals).
Is Django a CMS? Is Plone a Framework? For years I've seen developers stumble over the same problem: when should I use a CMS like Plone, Drupal or WordPress, and when should I use a framework like Django? This talk covers real case studies of projects gone horribly wrong through the use of the wrong technology. The audience will be left with several solid rules to follow to guide their future development.
Tkinter - the Python wrapper to the Tk graphics library - has been part of the Python standard library since very early on. However, that inclusion hasn't translated into extensive use.
There was a very good reason for this. Tk's documentation was beyond awful. And if you managed to get over that hurdle, Tkinter apps looked awful - they had a woefully inadequate set of widgets, styled with the very best of mid-1990s open source graphic skill.
And then, the world got obsessed with web frameworks, and the desktop was declared as dead.
However, in the last few years, many of the reasons Tkinter was ignored have been quietly fixed. Tk 8.5 massively improved the visual appearance of Tk. tkdocs.com has emerged, addressing many of the problems with Tk documentation.
In this talk, you'll get a re-introduction to an old friend, and an explanation of why, in a web and mobile world, you should care.
With Andrew MacDonald and Daehyok Shin
The Australian Bureau of Meteorology provides water availability forecasts to the public and to key stakeholders at different time-scales across the nation. A number of the systems driving these forecasts make extensive use of Python. Python is used throughout the forecasting process - from data ingestion and management, to hydrological modelling and data analysis through to graphical product generation. A wide variety of packages are used heavily. These include NumPy, SciPy, Matplotlib, PyTables and Pandas. Such a suite of scientific computing packages for Python enables us to complete the development of fully automated systems quickly even with limited resources.
This presentation will give an overview of the systems used by the Bureau in the generation of water availability forecasts and highlight the wide variety of tasks and processes enabled by Python. In particular, we will introduce the Hydrologic Reference Stations (HRS) toolkit and the Water Availability Forecasts for Australian Rivers (WAFARi) system. The HRS toolkit analyses time-series of streamflow data and produces a huge number of products describing mean state, trends and variability in that data, which are released at http://bom.gov.au/water/hrs. WAFARi is used to generate probabilistic seasonal streamflow forecasts along with a suite of graphical products for each of those forecasts. The system is being used to update streamflow forecasts available at http://bom.gov.au/water/ssf every month.
The web is a scary place, and building secure web applications is difficult. Luckily, you've got Python! The Python web community tends to take security seriously, so most popular Python web frameworks have defenses available. This talk looks at the list of the top 10 security vulnerabilities, as ranked by The Open Web Application Security Project (OWASP). We'll talk about what each attack is, and look at how to defend against them using three popular Python web frameworks -- Django, Pyramid, and Flask.
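As a framework-neutral illustration of one long-standing entry on the OWASP list, injection, here is a minimal sketch using only the standard library's sqlite3 module. The table, column and function names are made up for the example and it is not taken from the talk. Applications built on Django, Pyramid or Flask would normally get the same protection from their ORM or database layer.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable: the input is pasted into the SQL text, so a crafted
    # value can change the meaning of the query.
    query = "SELECT name FROM users WHERE name = '%s'" % name
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer: the ? placeholder sends the value separately from the SQL,
    # so it is always treated as data.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

malicious = "nobody' OR '1'='1"
print(find_user_unsafe(conn, malicious))  # the injected OR clause matches every row
print(find_user_safe(conn, malicious))    # no user has that literal name
```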
Classifying what type of job programmers do can be a challenge. Are we engineers? Are we scientists? Craftspeople? Something else entirely? Are software engineers, software developers, software architects, and programmers all really the same thing? This talk explores the nature of our work, and its relationship to the scientific method, including a dive into epistemology.
Starting with 13.5 million tweets from 2011 containing the word science: How I’ve explored the way people use ‘science’ on Twitter. The IPython notebook (http://ipython.org/notebook.html) is a great tool for research, allowing me to keep notes about my research interleaved with the python code. In addition, the Pandas python data analysis library (http://pandas.pydata.org/) supports working with large data tables with excellent support for time series data.
A guide to metaprogramming & magic methods -- we will look at practical examples from open source software, understand common patterns and learn to wield such powerful constructs to our advantage, instead of fearing them.
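To give a flavour of what a magic method does, here is a small sketch of a settings object that exposes dictionary keys as attributes. The class name Config and its fields are invented for illustration and do not come from the talk or from any particular open source project.

```python
class Config:
    """Hypothetical settings object that exposes dict keys as attributes."""

    def __init__(self, **values):
        self._values = dict(values)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails. Going through
        # __dict__ avoids recursing if _values has not been set yet.
        values = self.__dict__.get("_values", {})
        if name in values:
            return values[name]
        raise AttributeError(name)

    def __len__(self):
        # Lets len(config) report how many settings are stored.
        return len(self._values)

    def __repr__(self):
        items = ", ".join(
            "%s=%r" % (key, value) for key, value in sorted(self._values.items())
        )
        return "Config(%s)" % items


settings = Config(debug=True, workers=4)
print(settings.workers, len(settings))
print(settings)
```

Raising AttributeError for unknown names keeps built-ins such as hasattr() and getattr() with a default behaving as callers expect.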
This talk explores the challenges of ensuring responsiveness of applications under varying conditions like suddenly increased load, code regressions and problematic user data that reveal code paths with unusually high time complexity.
I'll be looking at interrupt-driven techniques to help bring the 95th percentile of the response times of your application closer to the (usually much lower) mean.
In this talk, we'll go beyond traditional tricks like caching, sharding and data denormalization and instead look at tools that can interrupt execution of overly expensive code paths, such that you can guarantee an upper bound in response time.
Interruptingcow and django-timelimit will be some of the tools that will be covered in this talk.
The context for most of this is web applications, and I'll be drawing many examples from our ongoing experiences with running and scaling Bitbucket, which is entirely written in Python.
Having said that though, many of the tools and techniques demonstrated will apply just as well to other types of applications and situations.
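The idea of interrupting an overly expensive code path can be sketched with the standard library alone. The example below is an assumption-laden illustration and not the API of Interruptingcow or django-timelimit. The names time_limit, TimeLimitExceeded and expensive_code_path are invented, and the approach relies on SIGALRM, so it only works on Unix and only in the main thread.

```python
import signal
import time
from contextlib import contextmanager


class TimeLimitExceeded(Exception):
    """Raised when the guarded block runs for too long."""


@contextmanager
def time_limit(seconds):
    # Hypothetical helper: arm a one-shot timer whose signal handler
    # raises an exception inside whatever code is currently running.
    def on_alarm(signum, frame):
        raise TimeLimitExceeded("gave up after %s seconds" % seconds)

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Cancel the timer and restore whatever handler was there before.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


def expensive_code_path():
    # Stand-in for a request that hits an unusually slow code path.
    deadline = time.time() + 5
    while time.time() < deadline:
        pass
    return "finished"


try:
    with time_limit(0.2):
        result = expensive_code_path()
except TimeLimitExceeded:
    result = "interrupted"

print(result)
```

The upper bound here comes from the timer rather than from the code being fast. A real web application would also need to decide what response to send once the limit is hit.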
This introductory tutorial will teach you how to effectively use modules and packages so your code is easier to read, test, package, deploy, reuse and maintain.
We will cover the basics of structuring your code with modules and packages, ways of using the import statement, how to document modules and packages, and a number of tips to ensure your code is less likely to end up a tangled mess that collapses when you need to modify or extend it.
This talk is compatible with Python 2 and 3.
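As a rough sketch of the basics this tutorial lists, the example below builds a throwaway package on disk and imports it in three different ways. The package name shapes, the module name area and the function rectangle are invented for illustration. A real project would simply keep these files in its source tree.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk so the example is self-contained.
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "shapes")
os.mkdir(package_dir)

with open(os.path.join(package_dir, "__init__.py"), "w") as f:
    f.write('"""Toy package of geometry helpers."""\n')

with open(os.path.join(package_dir, "area.py"), "w") as f:
    f.write(
        '"""Area calculations."""\n'
        "\n"
        "def rectangle(width, height):\n"
        '    """Return the area of a rectangle."""\n'
        "    return width * height\n"
    )

# Make the directory that contains the package importable.
sys.path.insert(0, root)
importlib.invalidate_caches()

import shapes.area                    # binds the top-level name "shapes"
from shapes import area as geometry   # binds the submodule under an alias
from shapes.area import rectangle     # binds a single function

# Docstrings written at the top of a module or package end up in __doc__.
print(shapes.__doc__)
print(geometry.__doc__)
print(rectangle(3, 4))
```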
These are interesting times for the Python packaging ecosystem, with the Python Packaging Authority (creators of the popular pip and virtualenv tools) emerging as the umbrella brand for a suite of related tools that will bring support for updated packaging standards to both the upcoming Python 3.4 and to existing versions of Python.
This talk will cover some of the history of Python's packaging tools and systems, where we are now, and where we aim to go in the future.
Self-taught in Python? Think you missed a bit? This 90 minutes will fix everything. From a really quick recap of the bare-bones essentials, you'll get a good grasp of the core of Python. Want to know more about classes, objects and more? This is for you. Also suitable for beginners who need a quick start.
- Recap of the essentials
- Understanding the object model
- Everything you've ever wanted to know about dictionaries
- Building on types
- Creating your own types - Object Oriented Programming
- Important elements from the Standard Library
- And lots of time for questions.
One of the first things you learn in programming is to apply a series of instructions to a set of elements.
Given its ubiquitous nature and given the culture of Python to simplify such tasks, decades of development and thought have gone into making this as convenient as possible for all possible use cases. While the functional ways of iterating like map, reduce and filter exist, list comprehensions and the functions in the itertools module enable several Pythonic imperative alternatives that cover a gamut of possible use cases.
This talk is an interactive exercise using the IPython notebook to cover many of these use cases to enable better iteration, including writing your own iterators and generators and when you would want to do such a thing.
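The styles of iteration mentioned in this abstract can be put side by side in a few lines. This is a minimal sketch rather than material from the talk. The countdown example and its names are invented.

```python
import itertools

numbers = [1, 2, 3, 4, 5, 6]

# Functional style: filter, then map.
squares_of_evens_functional = list(
    map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers))
)

# The same result as a list comprehension.
squares_of_evens = [n * n for n in numbers if n % 2 == 0]


def countdown(start):
    """Generator: yields start, start - 1, ..., 1 one value at a time."""
    while start > 0:
        yield start
        start -= 1


class Countdown:
    """The same idea written as a hand-rolled iterator class."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


# itertools helpers accept any iterable, including our own generator.
first_two = list(itertools.islice(countdown(5), 2))
pairs = list(itertools.product("ab", [1, 2]))

print(squares_of_evens_functional, squares_of_evens)
print(list(countdown(3)), list(Countdown(3)))
print(first_two, pairs)
```

The generator and the class produce the same values. The generator is usually shorter, while the class is handy when the iterator needs extra methods or state that callers can inspect.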
In the immortal words of that modern day philosopher Homer Jay Simpson: "Can't Someone Else Do It?"
If you're too lazy to configure your own servers: let Salt do it for you. Salt is an open source configuration management tool like Chef or Puppet but written in Python using ZeroMQ.
If you're too lazy to log in to your servers to run commands: let Salt do it for you. Salt is also a remote execution system. A single command run from the "master" salt server can call tens, hundreds, or thousands of remote servers.
And if you're too lazy to install both an OS and Salt itself: let Salt do it for you. Salt Cloud can spin up new boxes for you in the cloud, install Salt on them, and introduce them to the "master" salt server.
Salty vagrants, masters, minions, states, pillars, grains, salt clouds, parallel execution: I'll attempt to touch on them all in this talk.
You can also expect to be "assaulted" with a barrage of terrible salt-themed puns.
Social app development challenges us to code for users’ personal world. Users are giving push-back to ill-fitted assumptions about their own identity — name, gender, sexual orientation, important relationships, and many other attributes that are individually meaningful.
How can we balance users’ realities with an app’s business requirements?
Facebook, Google+, and others are struggling with these questions. Resilient approaches arise from an app’s own foundation. Discover how our earliest choices influence codebase, UX, and development itself. Learn how we can use that knowledge to both inspire the people who use our apps, and to generate the data that we need as developers.
The scikit-learn library is a rapidly growing open source toolkit for machine learning in Python. It allows practitioners and researchers to apply machine learning in a variety of applications and is used by companies worldwide. Developed by programmers from around the world, the project has a large (and increasing) number of machine learning algorithms, a very useful set of utility functions and has also spawned a set of detailed tutorials. Written in Python with the aid of NumPy, SciPy and Cython, this library is full-featured, fast and extensible.
In this talk, I will introduce scikit-learn, giving an overview of the library, its features and how to use it for a number of different applications. Next, I'll talk about some of the tutorials that are actively being developed for learning machine learning and scikit-learn, and also how to contribute. Through this, I'll introduce some key machine learning concepts and how you can apply them to a variety of tasks. The focus will be on practical uses, rather than theoretical advances.
To end the presentation, I'll briefly overview the research I perform at the Internet Commerce Security Laboratory (University of Ballarat) in cybercrime attribution, where I work with our industry partners to disrupt cybercrime. While it can be very difficult to do direct network based attribution, indirect methods through criminal profiling may assist in stopping crimes such as phishing or online fraud. I'll walk through some of our results in identifying the size and scope of the operations behind some of these attacks.
Scientists are required to do more and more programming these days; however, they are almost always self-taught. They spend a large percentage of their time wrestling with software, reinvent a lot of wheels, and still don’t know if their results are reliable. For proficient software developers, particularly those employed to assist these scientists, this computational illiteracy is easy to identify, but much harder to remedy. What do you teach a scientist, given their time constraints, background knowledge and work requirements? This presentation will give an answer to that question, by describing what gets taught at a Software Carpentry boot camp.
Software Carpentry is a volunteer organisation that has been teaching basic software development to scientists for over a decade. Supported by the Sloan Foundation and Mozilla, Software Carpentry has refined its teaching methods over time and has recently settled on a model centred around the delivery of intensive short courses known as “boot camps”. Institutions such as Oxford University, MIT, University of California Berkeley and the International Centre for Theoretical Physics have all hosted boot camps in the past 12 months.
Dr. Greg Wilson, the founder of Software Carpentry, flew out to Australia in February 2013 to host a series of two-day boot camps with the Genes to Geoscience Research Centre in Sydney and the Australian Meteorological and Oceanographic Society (AMOS) and ARC Centre of Excellence for Climate System Science in Melbourne. These two events were the first ever boot camps held outside of North America and Europe, were booked out in a matter of days, and received rave reviews from participants. Given this success, there are already plans afoot to host a pair of boot camps for the bioinformatics community in Brisbane and Adelaide later this year, plus another for the weather/climate community in Hobart.
This presentation will describe what gets taught at a Software Carpentry boot camp (hint: there's lots of Python), how it's taught and why it's so effective (hint: we take an evidence-based approach to teaching). All Software Carpentry teaching materials can be freely re-used under open access licenses, so at the conclusion of the talk you can take anything you've learned and apply it to your own work and/or teaching. Better still, you'll find out how to join the Australian Software Carpentry team, so that you can organise and/or teach a boot camp for your own discipline.
A good suite of tests is a programmer's best friend. But a poorly designed testing strategy can result in tests that are unwieldy, fragile, and above all no fun to write or maintain.
This talk will offer a smorgasbord of testing ideas, ranging from simple tips and tricks to fully-fledged testing frameworks for Python. We will touch on automation, table-based testing, round-trip testing, UI testing, property checking, testing Twisted Python applications, and more. Be prepared to think outside the box, as we will see that testing is an area where creativity and pragmatism pay off better than following a rigid set of rules.
Everyone should come away with a few new ideas, and renewed enthusiasm for test-driven development!
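Two of the ideas named in this abstract, table-based testing and round-trip testing, can be sketched with the standard library's unittest and json modules. The function slugify and the sample data are invented for the example. The talk itself may well use different tools.

```python
import json
import unittest


def slugify(title):
    """Hypothetical function under test: 'Hello World' -> 'hello-world'."""
    return "-".join(title.lower().split())


class SlugifyTableTest(unittest.TestCase):
    # Table-based testing: each row is (input, expected output).
    CASES = [
        ("Hello World", "hello-world"),
        ("  padded   words ", "padded-words"),
        ("single", "single"),
        ("", ""),
    ]

    def test_table(self):
        for title, expected in self.CASES:
            # subTest reports each failing row separately.
            with self.subTest(title=title):
                self.assertEqual(slugify(title), expected)


class JsonRoundTripTest(unittest.TestCase):
    # Round-trip testing: encoding then decoding should give back the input.
    SAMPLES = [{"a": 1}, [1, 2.5, "three"], "text", None, {"nested": [True, False]}]

    def test_round_trip(self):
        for value in self.SAMPLES:
            with self.subTest(value=value):
                self.assertEqual(json.loads(json.dumps(value)), value)


def run_examples():
    """Run both test cases and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(SlugifyTableTest))
    suite.addTests(loader.loadTestsFromTestCase(JsonRoundTripTest))
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_examples()
```

Adding a new case to either table is a one-line change, which is much of the appeal of both techniques.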
So you've built an awesome webapp, put it through its paces, and assured yourself that it does what it's supposed to do. Great! Now how does it behave when things start to go wrong?
This talk will demonstrate how the Mozilla Services team tests for failure scenarios in our web services, focusing on two key python-based tools: Marteau, a web-based frontend for easily running load-tests and analyzing the results, and Vaurien, a misbehaving TCP proxy that can simulate a variety of backend failure modes.
Used together, these tools can help ensure that a web service will not only scale up to meet its expected demand, but will fail gracefully when it finally reaches breaking point.
The connection between sub-second web application performance and revenue is becoming more and more apparent with established companies regularly reporting the benefits of reducing page load times.
This talk will cover:
- Designing for performance
- Approaches to instrumenting and measuring application performance
- Areas of focus for both front-end and back-end improvement
- Techniques, tools and modules available in Django-land for improving performance
- New and emerging technologies, for example the SPDY protocol and Django 1.5's StreamingHttpResponse
You know that Python and Django is the way forward for your client, but with a mountain of legacy PHP code, where do you start?
Ben spent the last 3 years working with a thriving charity organisation to migrate their large PHP system to Django. He'd like to share some survival strategies.
This talk covers:
- explaining the transition to your client
- first steps and initial experiments
- running PHP and Django in parallel
- why incremental migration beats the "big switch"
- sharing databases and authentication
- making the experience seamless for visitors and staff
- strategies for converting the code
Cython is brilliant: it looks like Python but compiles to native C. It can be used as a simple way of writing lightning-fast C extensions for Python, or as a simple means of hooking into already-existing C libraries. If you are writing CPU intensive applications, like, say, hypothetically, cracking one-way cryptographic functions, Cython is a perfect mixture of simple expressiveness while making sure the 'inner loop' of your code is running as close to the bare metal as possible.
And that's all this talk will be about, honest.
Why are you looking at me like that?
AutoNetkit is an open-source project to automatically build network configuration files for routers including Quagga, Cisco and Juniper, with complicated protocol configurations, all from a simple input graph --- which could even be drawn in a program such as yEd.
AutoNetkit was started at the University of Adelaide, and developed further at Cisco, with collaborators from both research and industry. It is being used at major router vendors, by network operators, and in university teaching, and in research publications including experiments with over 800 virtual routers --- nearly impossible to configure by hand.
This is all powered by Python, and we make significant use of the excellent packages available.
This talk will present a brief overview of what the AutoNetkit project is about and why we chose Python.
I will give an overview of our data model, and how methods such as __lt__, __eq__, __contains__, __getattr__, __setattr__ and __iter__ allow elegant yet expressive network design, such as using list comprehensions.
I will cover how we use the various modules, including:
- NetworkX graphs
- Netaddr IP addressing
- Mako Templates
- Exscript for scripting deployment
- TextFSM parsing of measurement
- Tornado to serve a browser-based d3.js visualisation framework
- Using IPython notebooks for interactive tutorials
We've spent quite a bit of time with Python on this project. This talk will pass on our favourite Python language features and the packages we've found along the way. It will be both a case study in using Python for a large-scale project, and an overview of useful packages.
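To show how the special methods listed in this abstract can make a data model pleasant to query, here is a small sketch. The Router and Network classes are invented for illustration and are not AutoNetkit's actual API.

```python
class Router:
    """Hypothetical node whose attributes are stored in a plain dict."""

    def __init__(self, name, **attributes):
        # Write through __dict__ so __setattr__ below is not triggered here.
        self.__dict__["name"] = name
        self.__dict__["_data"] = dict(attributes)

    def __getattr__(self, key):
        # Only called when normal lookup fails.
        data = self.__dict__.get("_data", {})
        if key in data:
            return data[key]
        raise AttributeError(key)

    def __setattr__(self, key, value):
        self._data[key] = value

    def __eq__(self, other):
        return isinstance(other, Router) and self.name == other.name

    def __lt__(self, other):
        # Lets sorted() order routers by name.
        return self.name < other.name

    def __hash__(self):
        return hash(self.name)

    def __repr__(self):
        return "Router(%r)" % self.name


class Network:
    """Hypothetical container of routers."""

    def __init__(self, routers):
        self._routers = {router.name: router for router in routers}

    def __iter__(self):
        return iter(self._routers.values())

    def __contains__(self, item):
        name = item.name if isinstance(item, Router) else item
        return name in self._routers


r1 = Router("r1", asn=1)
r2 = Router("r2", asn=2)
r3 = Router("r3", asn=1)
r3.loopback = "10.0.0.3"  # stored via __setattr__

network = Network([r3, r1, r2])

# __iter__, __getattr__ and __lt__ together make this one-liner possible.
asn1 = sorted(r for r in network if r.asn == 1)
print(asn1)
print("r2" in network, r3.loopback)
```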
Have you tried unit testing? Always meant to add tests to your project but didn't know where to start? This presentation will provide a gentle introduction to unit testing your module, package or entire project.
The standard library comes with the unittest module, but a great alternative is py.test. Py.test makes starting to test your project as easy as possible. When you need them, it has a full set of tools and testing capabilities.
This presentation will provide an overview of unit testing and then show how easy it is to get started using py.test.
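A first test file in the py.test style can be as small as the sketch below. The function median and the test names are invented for the example. The only conventions relied on are that py.test collects functions whose names start with test_ and that it reports failing plain assert statements.

```python
def median(values):
    """Hypothetical function under test."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("median of empty sequence")
    middle = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2.0


# No TestCase class or special assertion methods are needed.
def test_median_odd_length():
    assert median([3, 1, 2]) == 2


def test_median_even_length():
    assert median([4, 1, 3, 2]) == 2.5


def test_median_rejects_empty_input():
    try:
        median([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

Saved as test_median.py (a hypothetical file name), this would typically be run with the py.test command from the same directory. In a real project the function under test would live in its own module and be imported by the test file.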