Django community: RSS
This page, updated regularly, aggregates blog posts from the Django community.
-
PyCon.de: an admin's cornucopia, python is more than just better bash - Christian Theune
(One of my summaries of a talk at the 2017 PyCon.de conference). A "cornucopia" is a "horn of plenty". It keeps on giving. Pragmatism. I can quickly write some python for a task I need to do now. And it will be good enough. You can start right away: you don't need to start designing an architecture beforehand like you'd have to do in java. Often if you fix something quickly, you'll have to fix it a second time a day later. With python, you don't need to write your code twice. Perhaps 1.5 times. You can add tests, you can fix up the code. What do they use from python? Language features. Decorators, context managers, f-strings, meta programming. Python's standard library. You get a lot built in. Releasing. zc.buildout, pip, virtualenv. Testing. pytest, flake8. Lots of external libraries. Some of these in detail. Context managers. Safely opening and closing files. They had trouble with some corner cases, so they wrote their own context manager that worked with a temporary file and guaranteed you cannot ever see a half-written file. Decorators. Awesome. For instance for automating lock files. Just a (self-written) @locked on a command line function. asyncio. They use …
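Their actual context manager isn't shown in the summary, but the temporary-file trick is a common pattern. A minimal sketch, assuming a POSIX filesystem where os.replace() is an atomic rename:

    import os
    import tempfile
    from contextlib import contextmanager

    @contextmanager
    def atomic_write(path, mode='w'):
        """Write to a temp file, then atomically rename it into place.

        Readers see either the old file or the complete new one,
        never a half-written file.
        """
        # Create the temp file in the target directory so the final
        # rename stays on the same filesystem (and thus stays atomic).
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or '.')
        try:
            with os.fdopen(fd, mode) as f:
                yield f
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp_path, path)
        except BaseException:
            os.unlink(tmp_path)
            raise

    with atomic_write('settings.conf') as f:
        f.write('important = true\n')
-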
PyCon.de: the snake in the tar pit, complex systems with python - Stephan Erb
(One of my summaries of a talk at the 2017 PyCon.de conference). He started with an xkcd comic: Often it feels this way at the beginning of a project, later on it gets harder. You cannot just run import greenfield to get back to a green field again. Most engineering time is spent debugging. Most debugging time is spent looking for information. Most time spent looking for information is because the code/system is unfamiliar. Unfamiliar, unknown code: now we're talking about team size. You're probably debugging someone else's code. Or someone else is debugging your code. What can we do to understand such code? How can we spread the knowledge? You can do informal reasoning. Try to "run the code in your head". Code reviews. Pair programming. By getting better here, we create fewer bugs. Try to understand it by testing. Treat it as a black box. See what comes out. Add integration tests. Do load tests. Perhaps even chaos engineering. By getting better here we find more bugs. The first is better than the second way, right? Both get harder when it becomes more complex. Complexity destroys understanding. But I need understanding to have confidence. Keep in mind the … -
PyCon.de: graphql in the python world - Nafiul Islam
(One of my summaries of a talk at the 2017 PyCon.de conference). graphql is a query language for your API. You don't call the regular REST API and get the standard responses back, but you ask for exactly what you need. You only get the attributes you need. GraphiQL is a graphical explorer for graphql. GitHub is actually using graphql for its v4 API. He did a demo. The real question to ask: why graphql over REST? There is a standard. No more fights over the right way to do REST. Development environment (GraphiQL). You get only what you want/need. Types. Lots of companies are using it already. What does python have to offer? graphene. Graphene uses these concepts: Types/objects. More or less serializers. Schema. Collection of objects and mutations. "Your API". Resolver. Query. What you can ask of the API. "You can search for users by username and by email". Mutations. Changes you allow to be made. "You can create a new user and you have to pass a username and email". He demoed it. It looked really comfortable and slick. Some small things: 2.0 is out (today!). The django integration is better than the sqlalchemy integration at the …
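The demo code isn't in this summary; here's a minimal hypothetical sketch of the graphene concepts it lists (a type, a schema, and a resolver for "search for users by username"), in the graphene 2.0 style:

    import graphene

    class User(graphene.ObjectType):
        username = graphene.String()
        email = graphene.String()

    class Query(graphene.ObjectType):
        # The query part of the API: search for a user by username.
        user = graphene.Field(User, username=graphene.String(required=True))

        def resolve_user(self, info, username):
            # Hypothetical lookup; a real app would query the database.
            return User(username=username, email=username + '@example.com')

    schema = graphene.Schema(query=Query)
    result = schema.execute('{ user(username: "alice") { email } }')
    print(result.data)   # -> {'user': {'email': 'alice@example.com'}}
-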
PyCon.de: friday lightning talks
(One of my summaries of a talk at the 2017 PyCon.de conference). Parallel numpy with Bohrium - Dion Häfner He had to port a fortran codebase to numpy. Took a few months, but was quite doable. Just some number crunching, so you can do everything with numpy just fine. For production use it had to run on parallel hardware. For that he used bohrium, a tool that works just like numpy, but with jit-compiled code. He showed some numbers: a lot faster. Cultural data processing with python - Oliver Götze Cultural data? Catalogs of book archives. Lots of different formats, often proprietary and/or unspecified and with missing data. And with lots of different fields. He wrote a "data preparation tool" so that they can clean up and transform the data to some generic format at the source. The power of git - Peer Wagner What do you think your repositories contain? Code? More! He read a book about "data forensics". git log is ok. But you can pass it arguments so that you can get much more info out of it. You can show which parts of your code are most often edited. You can also look at which files are often …
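For the Bohrium talk: if I understood the drop-in claim correctly, using it amounts to swapping an import. A sketch from memory of the project's docs, so double-check the module name and semantics against them:

    # import numpy as np     # the plain numpy version
    import bohrium as np     # drop-in replacement with JIT compilation

    a = np.random.random((5000, 5000))
    b = (a + a.T) * 2.0      # the numpy-style code stays unchanged
    print(b.sum())
-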
PyCon.de keynote: dask, next steps in parallel Python - Matthew Rocklin
(One of my summaries of a talk at the 2017 PyCon.de conference). Matthew Rocklin works on dask for anaconda. He showed a demo. Python has a mature analytics stack. numpy, pandas, etcetera. These have one big drawback: they are designed to run in RAM on a single machine. Now... how can we parallelize it? And not only numpy and pandas, but also the libraries that build upon it. What can you do in Python: Embarrassingly parallel: multiprocessing, for instance. With the multiprocessing library, output = map(func, data) becomes output = pool.map(func, data). This is the simplest case. Often it is enough! Big data collections: spark, SQL, linear algebra. It manages parallelism for you within a fixed algorithm. If you stick to one of those paradigms, you can run on a cluster. Problem solved. Task schedulers: airflow, celery. You define a graph of python functions with data dependencies between them. The task scheduler then runs those functions on parallel hardware. How do these solutions hold up? Multiprocessing is included with python and well-known. That's good. But it isn't efficient for most of the specific scientific algorithms. Big data: heavy-weight dependency. You'll have to get everyone to choose Spark, for …
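The embarrassingly parallel transformation he mentioned looks like this in practice (func and data are made-up placeholders):

    from multiprocessing import Pool

    def func(item):
        return item * item   # placeholder for the real work

    if __name__ == '__main__':
        data = range(100)
        output = list(map(func, data))   # serial version

        with Pool() as pool:             # parallel: same shape
            output = pool.map(func, data)
-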
PyCon.de: programming the web of things with micropython - Hardy Erlinger
(One of my summaries of a talk at the 2017 PyCon.de conference). We're used to programming for computers that have keyboards and a mouse and displays. Hardy Erlinger talked to a fully packed room about "physical computing". Computers with all sorts of sensors like temperature sensors, physical switches and with outputs like motors, LEDs, etc. When you try to teach computing to people, something like print('hello world') often fails to excite people. Once you can get LEDs to blink or servos to move: that helps. Often, you'll see a single-board computer. A complete computer built on a single circuit board. A raspberry pi, for instance. That's already a pretty powerful machine. Effectively a linux machine. Everything's there. Fine. But it is not very handy if you want it to be mobile. "Mobile" meaning "you want to track the movements of your cat", for instance. It is too big to tie to your cat. And it requires quite a lot of electrical energy. The next smaller step in computing: microcontrollers. A computer shrunk into a single very small chip designed for use in an embedded system. It is often used as an embedded "brain" in a larger mechanical or electrical system …
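The "get LEDs to blink" first step looks roughly like this in MicroPython (the pin number is board-specific, so treat it as an assumption):

    from machine import Pin
    import time

    led = Pin(2, Pin.OUT)   # pin 2 is board-specific; adjust for yours

    while True:
        led.value(1)        # LED on
        time.sleep(0.5)
        led.value(0)        # LED off
        time.sleep(0.5)
-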
PyCon.de: empowered by Python - Jens Nie and Peer Wagner
(One of my summaries of a talk at the 2017 PyCon.de conference). Jens and Peer are working with pipeline inspections (for Rosen). (Real-world pipelines of up to 1000km long, not software pipelines). They build their own pipeline inspection robots. There's a lot of measurements coming out of such an inspection. One measurement every millimeter... So they're working with big data. And they're completely based on python. Everything from matplotlib, numpy, scipy, dask, etc. Also the laboratory measurements use python now. They were used to matlab, but python was much nicer and easier and more powerful. In the pipeline industry, lots of money and effort was invested in artificial intelligence. But it just did not work. Lots of overfitting. The time was just not right. A large problem was the lack of enough data. They have that now. And with machine learning, they're getting results. They also told about the history of their software development process. It started out as word documents that were then implemented. Next phase: prototypes in matlab with re-implementation in python. But the end-users discovered the prototypes and started using them anyway.... Now they're doing everything in python. And prototypes are now more "minimum viable … -
PyCon.de: keeping grip on decoupled code with CLIs - Anne Matthies
(One of my summaries of a talk at the 2017 PyCon.de conference). Anne has been writing Python since 1996! She mostly builds data pipelines for analysts. In a big company, those pipelines start to get messy quickly. Her solution: chop everything in those pipelines up. The biggest problem in software that is in use for more than a year: humans. Problems like performance are relatively easy and solvable. She showed some code with a "# uncomment what you need" comment. That was in infrastructure code that deployed something. Everything is code. Deployment is code. Infrastructure is code. Installing is code. And all that became messy. And others (like ruby programmers) needed to be able to use those tools/pipelines. The solution: chop everything up into individual packages with a proper setup.py and with command line tools. Everyone can install a python package and call a command line tool! For the command line, they use cliff, a "command line interface formulation framework". With setuptools entry points she could get extra installed libraries to inject their commands into the generic CLI. Photo explanation: picture from our recent cycling holiday (NL+DE). Small stream near Renkum (NL).
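The entry-point injection works roughly like this: the shared CLI builds a cliff CommandManager for some namespace, and any installed package that registers commands under that namespace gets picked up. A hypothetical sketch (package, namespace and command names made up):

    # setup.py of a plugin package
    from setuptools import setup, find_packages

    setup(
        name='pipeline-cleanup',
        version='0.1',
        packages=find_packages(),
        install_requires=['cliff'],
        entry_points={
            # The generic CLI scans this namespace, e.g. via
            # cliff.commandmanager.CommandManager('pipeline.commands').
            'pipeline.commands': [
                'clean = pipeline_cleanup.commands:Clean',
            ],
        },
    )
-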
PyCon.de: building your own SDN with linux/saltstack/python - Maximilian Wilhelm
(One of my summaries of a talk at the 2017 PyCon.de conference). SDN? Software defined networking. You can just give a lot of money to cisco, right? Well, such money isn't always available. And it doesn't always do what we want. They needed an SDN for a city-wide point-to-point wifi network between various buildings in Paderborn. Recently he installed a new linux and typed ifconfig, route, arp? They aren't there anymore. iproute2 is now the swiss army knife for networkers. VXLAN, VRF, MPLS, VLAN-aware bridges, IPsec, OpenVPN: linux has it all built in. You can use it. Network configuration? It used to be ifupdown, but that is not easily automated. You can change the config file, but reloading is not possible... Restarting the network disrupts the connections... So there's now ifupdown2, written in python. You can extend it. Batteries included: dependency resolution, ifreload, VRFs, VXLAN, VLAN-aware bridges. And: they're open for ideas. You can send pull requests. For their network, they needed a routing solution. There are many open source implementations you can use. One of them, ExaBGP, is even written in Python. They used bird for OSPF. Configuring it all? Salt stack. Continuous management. Extensible. Salt stack works on … -
PyCon.de: use ansible properly or stick to your scripts - Bjoern Meier
(One of my summaries of a talk at the 2017 PyCon.de conference). Ansible is an infrastructure management tool. You have an "inventory" with your hosts and what kinds of hosts they are ('webserver', 'database'), combined with a "playbook" that tells what to do with what kind of host. They started with mapping the various manual deployment steps to ansible tasks. A playbook would just be a list of tasks that call shell scripts. This was wrong. A task would always result in a change. Another big problem? Ansible's check mode (or diff mode) would not work. A shell script cannot be simulated, so "check" will skip it. The solution? Use proper ansible modules. Modules can mostly check the state and determine what should be done. You can write your own modules, which means writing python code. This means you can also properly test your code (which is harder to do with shell scripts). He showed some example code, including code for checking whether something would change. And with a test playbook for testing the module. A common problem is that ansible doesn't know if something changed in your application: does it need to be restarted or not? The "solution" is …
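A minimal sketch of such a custom module, with check-mode support (the state inspection is a hypothetical stub):

    from ansible.module_utils.basic import AnsibleModule

    def state_differs(name):
        # Hypothetical: a real module would inspect the system here.
        return True

    def main():
        module = AnsibleModule(
            argument_spec=dict(name=dict(type='str', required=True)),
            supports_check_mode=True,   # makes --check meaningful
        )
        changed = state_differs(module.params['name'])
        if module.check_mode or not changed:
            # Report what *would* change without changing anything.
            module.exit_json(changed=changed)
        # ... actually apply the change here ...
        module.exit_json(changed=True)

    if __name__ == '__main__':
        main()
-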
PyCon.de: effective data analysis with pandas indexes - Alexander Hendorf
(One of my summaries of a talk at the 2017 PyCon.de conference). (Warning beforehand: I hardly know pandas, so my summary might not be totally correct/useful/complete) When he started using pandas, differences between a Series and a DataFrame tended to trip him up often. A Series is just like an array. It has a type, as it uses numpy under the hood ("labeled numpy arrays"). It has one type, so a series with ints and floats will be all-floats. Slicing is just series[3:6] or series.iloc[3:6]. He prefers the latter as it is more explicit. A DataFrame is a bunch of series with an index (that is also a series). If you slice, you get rows. If you ask for one item, you get a column. It is better if you use .iloc(). A very powerful concept: a boolean index. sales_data['units'] > 40 gives you an index with everything that sold more than 40 items. You can AND and OR those indexes. Handy for filtering. Multi-index. Handy for data that is hierarchical (country, towns, etc). Datetime index. You can use a function to convert timestamps to actual datetimes. Pandas will now treat it correctly, for instance in plots. You can group by years and …
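The boolean-index example from the talk, fleshed out with made-up sales numbers:

    import pandas as pd

    sales_data = pd.DataFrame({
        'product': ['a', 'b', 'c', 'd'],
        'units': [12, 55, 41, 7],
    })

    # Boolean index: True for every row that sold more than 40 units.
    mask = sales_data['units'] > 40

    # AND/OR via & and |; each condition needs its own parentheses.
    extremes = sales_data[mask | (sales_data['units'] < 10)]

    # Slicing a series: equivalent, but .iloc is more explicit.
    s = sales_data['units']
    print(s[1:3].equals(s.iloc[1:3]))   # True
-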
PyCon.de: public transport efficiency with geopandas and GTFS - Pieter Mulder
(One of my summaries of a talk at the 2017 PyCon.de conference). Pieter Mulder works at door2door on a ride-sharing platform. He works on the research into demand. He uses: geopandas, which extends pandas with a geometrical column type (using 'shapely'). You can even reproject a geometry series from one projection to another. GTFS: General Transit Feed Specification. It is defined by google: a common format for transportation schedules and associated geographical information. It isn't scary, it is basically a zipfile with a bunch of csv files. Geonotebook: adds a map to your jupyter notebook with two-way interaction. He then showed a demo with jupyter. Extracting all stops from local Karlsruhe GTFS files. Plotting them on the map. Searching for bus or tram stops within 5 minutes walking distance. Finding the stops you can reach with a trip of max half an hour. Nice demo! (Something similar is being done with opentripplanner (written in java).) Photo explanation: picture from our recent cycling holiday (NL+DE). Disused railway bridge over the Rhein at Wesel.
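A small sketch of the geopandas reprojection he mentioned (stops made up, a recent geopandas assumed):

    import geopandas as gpd
    from shapely.geometry import Point

    # Hypothetical stops, e.g. parsed from a GTFS zipfile's stops.txt.
    stops = gpd.GeoDataFrame(
        {'name': ['Hbf', 'Marktplatz']},
        geometry=[Point(8.402, 49.009), Point(8.404, 49.010)],
        crs='EPSG:4326',   # lon/lat
    )

    # Reproject the geometry series to a metric projection (UTM 32N)
    # so distances come out in meters instead of degrees.
    stops_utm = stops.to_crs(epsg=32632)
    print(stops_utm.geometry.iloc[0].distance(stops_utm.geometry.iloc[1]))
-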
PyCon.de: thursday lightning talks
(One of my summaries of a talk at the 2017 PyCon.de conference). (Note: if you have a name correction, mail me... Lightning talks go very fast) Note two: look at the mp3 player for very small kids talk below. That one was great. Work of a data analyst - Sofia Kosoran She talked yesterday about the difference between academia and industry. She now told about an experience she had at her job. She didn't need deep learning or other hard stuff. Just a few minutes with pandas and taking the max meant that she became famous in her company :-) A linux firewall framework - Maximilian Wilhelm Alff: a linux firewall framework. For instance for a distributed linux firewall, based on a central point of truth/configuration. It knows about network topology, services and security classes. You can extend it with python plugins. If you don't know iptables: don't use this tool. See https://github.com/BarbarossaTM/alff Introducing BiCDaS - Nils Hachmeister BiCDaS: Bielefeld center for data science. It is a cross-section of different parts of the university, an initiative coming out of a round table. So: interdisciplinary. The participants come from all over the university: biology, sociology, law, technical, IT, etc. Something … -
PyCon.de: the python ecosystem for data science - Christian Staudt
(One of my summaries of a talk at the 2017 PyCon.de conference). Data science often has a similar workflow: acquire, ingest/clean, store/manage, data wrangling, visual analysis, modeling, story-telling. For many of those stages, python has nice tools. Christian Staudt calls it an ecosystem. Well, if you make a diagram showing the various tools it starts to look like one of those biology diagrams showing which kinds of animal eat which other kinds of animals. Likewise, python libraries have their function and their specialized niche. Data wrangling Numpy. The fundamental package for numeric computing in python. N-dimensional arrays. Numpy arrays are different from python lists: they're laid out in memory in a much more effective and compact way. Essential for understanding numpy: "lose your loops". Don't loop over arrays with regular python operations, but use numpy methods. That pushes everything down into highly efficient compiled code. That can gain you an order of magnitude in performance. Pandas. Labeled indexed array data structures: series, dataframes, timeseries. It also includes operations tailored to them like group_by and filter. And handy data import functionality (csv, excel, etc). He showed a quick example. It pays off to experiment with pandas and to explicitly … Dask …
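"Lose your loops", made concrete:

    import numpy as np

    data = np.random.random(1_000_000)

    # A python loop: every element passes through the interpreter.
    total = 0.0
    for x in data:
        total += x

    # The numpy way: one call, executed in efficient compiled code.
    total = data.sum()
-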
PyCon.de keynote: Neutrinos - Susanne Mertens
(One of my summaries of a talk at the 2017 PyCon.de conference). Susanne Mertens is a physics professor in Munich. Neutrinos? Karlsruhe? What is the link? There is the KATRIN experiment with the 'KA' of Karlsruhe. An experiment looking at neutrinos. Computing and programming tools are essential for scientific experiments. She'll tell about the physics background and end up with python. She showed some mind-boggling numbers. Huge number of galaxies. Galaxies made of huge numbers of stars. One of them being our sun. Our planet then has 7 billion people. Everyone consists of a huge number of atoms. Multiply those numbers... But. All those atoms? 4% of the mass of the universe. The other 96%? Dark energy, dark matter. But what is it? There's lots of research into neutrinos. When you measure radioactive decay, you'd expect to measure a fixed amount of energy. But that's not the case. Someone postulated that maybe another unknown particle might be created: a neutrino. There are a lot of neutrinos. But: they hardly interact with regular matter. Enormous numbers pass right through your body every second. So detecting and measuring them is a problem. It took 30 years to design an experiment to … -
PyCon.de: modern ETL-ing with Python and Airflow - Tamara Mendt
(One of my summaries of a talk at the 2017 PyCon.de conference). ETL? Extract, Transform, Load. It goes hand-in-hand with traditional data warehousing. But that's the traditional sense. You can also see it as generic data transformation. The principles of ETL-ing are applicable. ETL often implies batch running. You could do it streaming, but most people still use batch processing. There are lots of commercial ETL tools. They have problems, however. They are mostly designed to deal with well structured data. They were made for moving from one DB system to another. They don't match well with the variety of sources you tend to have now. Mostly not open source. You're limited to the built-in functionality. She suggests using python instead: Easy. More flexible. Reusing logic is easier. Abstraction is possible. You can test the logic. Versioning and collaboration. Airflow is a python platform to programmatically author, schedule and monitor workflows. It is a great tool that combines the user-friendliness of commercial ETL tools with the flexibility of Python. It is python! It is open source. Note: it recently became an Apache Incubator project. There's a good community. Really nice: dynamic pipeline and task creation. You can create them …
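The dynamic task creation she praised is plain python; a hypothetical sketch using the Airflow 1.x-era import paths:

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator

    def extract(table):
        print('extracting', table)   # placeholder for real ETL logic

    dag = DAG('etl_example',
              start_date=datetime(2017, 1, 1),
              schedule_interval='@daily')

    # Dynamic task creation: an ordinary loop, no config DSL needed.
    for table in ['users', 'orders', 'events']:
        PythonOperator(
            task_id='extract_{}'.format(table),
            python_callable=extract,
            op_kwargs={'table': table},
            dag=dag,
        )
-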
PyCon.de: Python in space, the N body problem - Daniel Jilg
(One of my summaries of a talk at the 2017 PyCon.de conference). Daniel Jilg starts off with a picture of celestial body (sun, planets) movements in a geocentric worldview. When you put the earth in the center, the movements of the planets are hard to map. When you put the sun in the center, the movements make more sense. Circular movements or, as Kepler improved, elliptical movements. Newton improved on it with his law of gravity. Now the movements can be calculated. You can even look at the current planets' movements and look at the small incongruences. Would an extra planet, a long way off, fix those incongruences? Calculate it and start pointing your telescope. They're apparently in the pointing-your-telescope phase for some "planet nine" now. Once the number of celestial bodies gets higher, the calculation gets harder and takes longer. You can express that with the "big O notation". O: Order of magnitude. If you have a loop over all bodies inside a loop over all bodies.... you get n^2. That gets out of hand quite quickly. There's an algorithm that can do better: n log(n), which grows much more slowly than n^2. The algorithm places the bodies …
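To make the n^2 concrete: a toy all-pairs loop over hypothetical 1-D bodies. The nested iteration over pairs is exactly what the tree-based n log(n) algorithms avoid:

    import itertools

    def total_pairwise_energy(masses, positions, G=6.674e-11):
        """Naive O(n^2): every body interacts with every other body."""
        energy = 0.0
        for (m1, p1), (m2, p2) in itertools.combinations(
                zip(masses, positions), 2):
            r = abs(p1 - p2)   # 1-D toy; real code uses 3-D vectors
            energy -= G * m1 * m2 / r
        return energy

    print(total_pairwise_energy([1e3, 2e3, 3e3], [0.0, 1.0, 2.5]))
-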
PyCon.de: plugin ecosystems for Python web-applications - Raphael Michel
(One of my summaries of a talk at the 2017 PyCon.de conference). Raphael Michel asked for a show of hands. "Who is a python web developer". "Who used wordpress previously". Quite a number of us have worked with wordpress. About a quarter of the internet runs on wordpress or so. One of the reasons is its plugin mechanism that makes it possible to do lots of things. He works on pretix, a python ticket (concert tickets, not bug tracker tickets) system. They wanted to make it extensible, as there are many special cases (for instance payment processors). Their idea: establish a way that plugins can hook into your application. Provide many of these hooks yourself. Document it well. They started with a simple signal system. A class that you can register functions on and a send method that simply calls all the registered functions. They set it up so that you could use decorators. If you want to use something like that and you're using django: use django's django.dispatch.Signal. If you use django, do also package it up as a 'django application'. But... if you want to use such an application, you have to: Install the package. Add the app …
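The "simple signal system" they started with can be sketched in a few lines (hook and function names are made up):

    class Signal:
        """Minimal sketch: register functions, call them all on send."""

        def __init__(self):
            self._receivers = []

        def connect(self, func):
            self._receivers.append(func)
            return func   # returning func makes connect() a decorator

        def send(self, **kwargs):
            return [receiver(**kwargs) for receiver in self._receivers]

    order_paid = Signal()   # a hook the application provides

    @order_paid.connect
    def notify_payment_provider(order=None, **kwargs):
        print('plugin called for order', order)

    order_paid.send(order=42)
-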
PyCon.de: wednesday lightning talks
(One of my summaries of a talk at the 2017 PyCon.de conference). Lightning talks, so I probably won't get all the names right. If you have additions/corrections, mail me :-) Automatic screenshots of your web app - Raphael Michel Raphael develops a web application. A web app needs documentation. Documentation needs screenshots. Making screenshots by hand is a lot of work. You have to do it over and over again for every new version. Selenium is a tool for instrumenting/steering your browser. He uses chrome --headless. So a proper browser, but without the GUI. py.test handles the various testcases, including the fixtures. Django has a LiveServerTestCase that is also handy. The screenshots are written like tests. Really simple and powerful. Note: djangocon.eu 2018 will be in Heidelberg, Germany. 23-27 May. https://2018.djangocon.eu Counter-intuitive optimizations - Michael Penkov He had to make some counter-intuitive optimizations lately. He gathers data about websites and stores the info in mongodb. 500 million domains, more or less. He wanted, from an existing list, to know which domains weren't in the system yet. Doing it in python and querying the DB was slow. In the end he exported everything to text files and did it with linux …
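For the screenshot talk: the headless-Chrome-plus-selenium combination boils down to something like this (URL and paths hypothetical; assumes a reasonably recent selenium):

    from selenium import webdriver

    # A proper browser, but without the GUI.
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    driver = webdriver.Chrome(options=options)

    driver.set_window_size(1280, 800)
    driver.get('http://localhost:8000/admin/')
    driver.save_screenshot('docs/_static/admin.png')
    driver.quit()
-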
PyCon.de: vim your Python, Python your vim - Miroslav Šedivý
(One of my summaries of a talk at the 2017 PyCon.de conference). He has a nice simple keyboard layout. No weird key combinations. He started showing different keyboard layouts and started speaking fluently in en, de, sk, cs, fr, es, it, pl, sv, hu, eo, tr while explaining the various countries' keyboard layouts. How do you manage all those languages on ONE keyboard with only ONE brain? Switching keyboards is not an option. Charmaps are not easy. There used to be a key called the compose key. It was an actual key on older keyboards. You use compose + two or more keys after each other to get an é or an è, for instance. There's an X11 keymap for it. You can map various keys to function as the compose key. The printscreen key or the right control key or windows key, for instance. Another option is xcape: you can use one key (for instance the caps lock) as both the compose key and as another key. Keeping it pressed makes it function as one, just pressing and releasing it quickly as the other. So: he uses the caps lock key as both ctrl and compose key. He then switched … -
PyCon.de: the borgbackup project - Thomas Waldmann
(One of my summaries of a talk at the 2017 PyCon.de conference). Borgbackup is 2.5 years old, but the code is older: it is a fork of Attic. Thomas discovered Attic after someone blogged about it. They forked it to get more collaboration and quicker releases. Borg backup is a backup tool. There are 1000 backup tools. So what's different? Borg is one you might actually enjoy using. The features sound logical: simple, efficient, safe, secure. How borg sees this: Simple. Each backup is a full backup. Restore? Just do a FUSE mount. Easy pruning of old backups. Tooling: it is just borg, ssh and a shell. It is a single-file binary. There's good filesystem and OS support. There's good documentation. Efficient. It is very fast for unchanged files. Every backup is a full backup, but unchanged files don't need to be handled a second time. Chunk deduplication, sparse file support, flexible compression scheme. Compression is chunk-based, it doesn't compress the whole file at once. Safety. Checksums, transactions, filesystem syncing, atomic operations. Checkpoints while backing up. You can have off-site remote repositories. Secure. Authenticated encryption. There's nothing to see in the repo: borg doesn't trust the backup host, everything is … -
Five stories about the California Wildfires you probably missed
You’ve probably heard about the massive wildfires in Northern California. You probably know that they’re huge, that over 50 people have died, and that some wineries have burned. You might have seen some pictures. Unless you’ve been following closely, though, there’s a lot you’re missing. The vast majority of the reporting has lacked context, been overly sensationalistic, or outright ignored deeper, more complex stories. This is far more than a story about a natural disaster. -
How to Implement Django's Built In Password Management
Let's implement Django's built... -
Mercurial Mirror For Django 2.0 Branch
The first Beta was released today, so it seems a good day to start the mirror for the 2.0 branch of Django. For the record, the main purposes of this mirror are: being a lightweight read-only repository to clone from for production servers, and hiding the ugly git stuff behind a great mercurial interface. The clone is […] -
Mailchimp Integration
The Mailchimp API is extensive...