Django community: RSS
This page, updated regularly, aggregates blog posts from the Django community.
-
Creating Fake Data with Faker in Python
When testing against a database, we often need fake data to exercise the code. This post introduces Faker, whose only job is to generate semi-random fake data such as names, addresses, domain names, and paragraphs. Create a virtualenv and install Faker (the package was published on PyPI as fake-factory when this post was written and is now named Faker):
    mkvirtualenv test
    pip install Faker
Create fake names:
    from faker import Factory

    def create_names(fake):
        """Print ten fake full names."""
        for i in range(10):
            print(fake.name())

    if __name__ == "__main__":
        fake = Factory.create()
        create_names(fake)
This prints a different set of names on each run (yours will differ from mine):
    Mrs. Terese Walter MD
    Jess Mayert
    Ms. Katerina Fisher PhD
    Mrs. Senora Purdy PhD
    Gretchen Tromp
    Winnie Goodwin
    Yuridia McGlynn MD
    Betty Kub
    Nolen Koelpin
    Adilene Jerde
If you don't want prefixes and suffixes in the names:
    from faker import Factory

    def create_names2(fake):
        """Print ten fake names built from a first and a last name only."""
        for i in range(10):
            name = "%s %s" % (fake.first_name(), fake.last_name())
            print(name)

    if __name__ == "__main__":
        fake = Factory.create()
        create_names2(fake)
Next, let's see how to generate other kinds of fake data:
    from faker import Factory

    def create_fake_stuff(fake):
        """Print one fake value for each of several provider methods."""
        stuff = ["email", "bs", "address", "city", "state", "paragraph"]
        for item in stuff:
            # Look up the provider method by name and call it.
            print("%s = %s" % (item, getattr(fake, item)()))

    if __name__ == "__main__":
        fake = Factory.create()
        create_fake_stuff(fake)
You might get output like this:
    email = pacocha.aria@kris.com
    bs = reinvent collaborative systems
    address = 57188 Leuschke Mission Lake Jaceystad, KY 46291
    city = West Luvinialand
    state = Oregon
    paragraph = Possimus nostrum exercitationem harum eum in. Dicta aut off
Faker has many other methods that are not covered here; see its documentation for details. -
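The idea of "semi-random" data can be illustrated without installing anything. The following is only a toy sketch using the standard library, not how Faker itself is implemented; the class name TinyFaker and the small name pools (borrowed from the sample output in the post) are assumptions made for illustration. It shows that a seeded generator produces repeatable values, which is often what a test suite wants.

```python
import random

# Small sample pools; real generators ship far larger, locale-specific lists.
FIRST_NAMES = ["Terese", "Jess", "Katerina", "Winnie"]
LAST_NAMES = ["Walter", "Mayert", "Fisher", "Goodwin"]


class TinyFaker:
    """Toy stand-in for a fake-data generator (hypothetical, not Faker's code)."""

    def __init__(self, seed=None):
        # A private Random instance keeps the sequence independent of other code.
        self._rng = random.Random(seed)

    def first_name(self):
        return self._rng.choice(FIRST_NAMES)

    def last_name(self):
        return self._rng.choice(LAST_NAMES)

    def name(self):
        return "%s %s" % (self.first_name(), self.last_name())


# Two generators built with the same seed yield the same sequence of names.
fake_a = TinyFaker(seed=42)
fake_b = TinyFaker(seed=42)
names_a = [fake_a.name() for _ in range(5)]
names_b = [fake_b.name() for _ in range(5)]
```

Seeding is a design choice: unseeded data gives broader coverage across runs, while seeded data makes a failing test reproducible.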
Tinkering with Django and Ember.js
Hi everyone. Recently I have been looking at the various JavaScript frameworks that have sprung up lately. Three stood out from the rest: BackboneJs, AngularJs and EmberJs. After looking at all three, EmberJs is the one that appeals to me the most. But if you hadn't guessed from the tutorials on this blog, I also like Django a lot. Most tutorials that pair EmberJs with a real data backend use Ruby on Rails, and the few I could find that covered Django were based on TastyPie, while I like Django-REST-Framework better... So I decided to dive in, go full speed ahead and learn the hard way. The result is Djember-CMS. -
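How to Re-import a Shadowed Built-in Function in Python
While writing a Python application recently, I ran into a problem: importing a certain module automatically brought a custom int into the namespace, shadowing Python's built-in int function. Since we needed Python's own int, we had to find a way to get it back. Fortunately, this is easy to solve with the __builtin__ module: from __builtin__ import int as py_int. Python's int is now available again, under the name py_int. The most common cause of a function or variable being shadowed is a wildcard import: from something import *. With this form of import we cannot tell exactly which variables or functions were brought in, nor whether they shadow existing ones. This is one of the main reasons importing with "*" is discouraged. In Python 3, use builtins in place of __builtin__. -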
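A minimal Python 3 sketch of the same recovery follows. The shadowing int function is a hypothetical stand-in for whatever a wildcard import might have dragged in; only the builtins module is real.

```python
import builtins


def int(value):
    """Hypothetical replacement that shadows the built-in int in this module."""
    return "custom:%s" % value


# The built-in is still reachable through the builtins module.
py_int = builtins.int

shadowed_result = int("42")     # calls the shadowing function defined above
restored_result = py_int("42")  # calls Python's real int
```

Binding the built-in to a new name such as py_int leaves the shadowing definition untouched, so code that relies on the custom int keeps working.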
July 2014 ShipIt Day Recap
This past Friday we celebrated another ShipIt day at Caktus. There was a lot of open source contribution, exploring, and learning happening in the office. The projects ranged from native mobile Firefox OS apps, to development on our automated server provisioning templates via Salt, to front-end apps aimed at using web technology to create interfaces where composing new music or performing Frozen’s Let It Go is so easy anyone can do it. Here is a quick summary of the projects that folks worked on: Calvin worked on updating our own minimal CMS component for editing content on a site, django-pagelets, to work nicely with Django 1.7. He is also interested in adding TinyMCE support and making it easy to upload images and reference them in the block. If you have any other suggestions for pagelets, get in touch with Calvin. Philip worked on code to tag words in a text with basic information about their etymologies. He was interested in exploring words with dual French and Anglo-Saxon variations, e.g. “Uncouth” and “Rude”. These words have evolved from different origins to have similar meanings in modern English and it turns out that people often perceive the French or Latin … -
Accessing Multiple PostgreSQL Schemas from Django
Django lacks support for multiple PostgreSQL schemas. We had previously tried several ways of accessing schemas other than public, but all of them were hard to maintain. Recently, however, we found that the problem is easily solved with PostgreSQL's search_path parameter.
A simple example
Suppose all the tables of a Django project are created in the django schema, and the project also needs a few tables from the legacy schema. We can handle this with a mapping on the PostgreSQL side and the DATABASES setting in Django. Here are two different ways to set it up.
Setting search_path at connection time
Assume the django and legacy schemas already exist and the Django project has access to the database. In the Django settings, we set search_path in the connection options:
    # settings.py
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'OPTIONS': {
                'options': '-c search_path=django,public'
            },
            'NAME': 'multi_schema_db',
            'USER': 'appuser',
            'PASSWORD': 'secret',
        },
        'legacy': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'OPTIONS': {
                'options': '-c search_path=legacy,public'
            },
            'NAME': 'multi_schema_db',
            'USER': 'appuser',
            'PASSWORD': 'secret',
        },
    }
This approach requires the fewest changes to an existing project.
Using different database users
The main drawback of the setup above is that the search_path option has to be sent every time Django opens a connection to PostgreSQL. To avoid that overhead, we can set search_path for each user ahead of time. Log in to the psql shell as the postgres user:
    -- user accessing django schema...
    CREATE USER django_user LOGIN PASSWORD 'secret';
    GRANT appuser TO django_user;
    ALTER ROLE django_user SET search_path TO django, public;

    -- user accessing legacy schema...
    CREATE USER legacy_user LOGIN PASSWORD 'secret';
    GRANT appuser TO legacy_user;
    ALTER ROLE legacy_user SET search_path TO legacy, public;
In the project's settings.py:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'multi_schema_db',
            'USER': 'django_user',
            'PASSWORD': 'secret',
        },
        'legacy': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'NAME': 'multi_schema_db',
            'USER': 'legacy_user',
            'PASSWORD': 'secret',
        },
    }
Those are two different ways of using search_path. A custom Django database router could probably also select the correct schema automatically. -
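The post only hints at the router idea, so the following is a rough sketch under stated assumptions rather than the author's solution. It assumes the legacy tables are modelled in an app with the hypothetical label "legacy_app", and that the 'legacy' alias from the DATABASES examples exists. The router is a plain class; the small stand-in model classes exist only so the sketch runs without Django installed.

```python
class LegacyRouter:
    """Send models from the (hypothetical) legacy app to the 'legacy' alias."""

    # Assumption: every model that lives in the legacy schema belongs to this app.
    legacy_app_labels = {"legacy_app"}

    def db_for_read(self, model, **hints):
        if model._meta.app_label in self.legacy_app_labels:
            return "legacy"
        return None  # no opinion: Django falls back to 'default'

    def db_for_write(self, model, **hints):
        return self.db_for_read(model, **hints)


# Stand-ins for Django model classes, used only to exercise the router here.
class _Meta:
    def __init__(self, app_label):
        self.app_label = app_label


class FakeLegacyModel:
    _meta = _Meta("legacy_app")


class FakeBlogModel:
    _meta = _Meta("blog")


router = LegacyRouter()
legacy_alias = router.db_for_read(FakeLegacyModel)
default_alias = router.db_for_read(FakeBlogModel)
```

In a real project the class would be listed in the DATABASE_ROUTERS setting by its dotted path, so queries on legacy models pick the 'legacy' connection (and therefore its search_path) without an explicit using() call.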
Deferred Tasks and Scheduled Jobs with Celery 3.1, Django 1.7 and Redis
Setting up Celery with Django can be a pain, but it doesn't have to be. In this video, learn what it takes to set up Celery for deferred tasks and as your cron replacement. We will use Celery 3.1 and Django 1.7, both of which introduce changes you need to be aware of. Watch Now... -
lxml's Element Builder and CDATA Objects
lxml has a very handy element builder, but it does not cope as well with CDATA objects:
    >>> from lxml.builder import E
    >>> from lxml.etree import CDATA
    >>> E.stuff(CDATA('Some stuff that needs to be in a CDATA section'))
    Traceback (most recent call last):
      File "<ipython-input-4-40103024e8d8>", line 1, in <module>
        E.stuff(CDATA('Some stuff that needs to be in a CDATA section'))
      File "/usr/lib/python2.7/dist-packages/lxml/builder.py", line 220, in __call__
        raise TypeError("bad argument type: %r" % item)
    TypeError: bad argument type: <lxml.etree.CDATA object at 0x238a130>
This can be fixed as follows:
    from lxml.builder import ElementMaker
    from lxml.etree import CDATA

    def add_cdata(element, cdata):
        # Refuse to overwrite text that is already on the element.
        assert not element.text, "Can't add a CDATA section. Element already has some text: %r" % element.text
        element.text = cdata

    # Tell the builder how to handle CDATA arguments.
    E = ElementMaker(typemap={
        CDATA: add_cdata
    })
After that it works as expected:
    >>> from lxml import etree
    >>> etree.tostring(E.stuff(CDATA('Some stuff that needs to be in a CDATA section')))
    '<stuff><![CDATA[Some stuff that needs to be in a CDATA section]]></stuff>'
-
How to Send Email in Python
There are several cases to consider when sending email from Python: plain-text mail, mail with attachments, and other kinds of mail. First, create an environment with virtualenv:
    $ virtualenv env
    $ env/bin/pip install wheezy.core
Plain-text mail
Straight to the code:
    # plain.py
    from wheezy.core.mail import MailMessage
    from wheezy.core.mail import SMTPClient

    mail = MailMessage(
        subject='Welcome to Python',
        content='Hello World!',
        from_addr='someone@dev.local',
        to_addrs=['you@dev.local'])
    client = SMTPClient()
    client.send(mail)
Then send it with the following command:
    python plain.py
To send an HTML mail, replace content with HTML, set content_type to 'text/html', and set the charset:
    from wheezy.core.mail import MailMessage
    from wheezy.core.mail import SMTPClient

    content = """\
    <html><body>
    <h1>Hello World!</h1>
    </body></html>"""
    mail = MailMessage(
        subject='Welcome to Python',
        content=content,
        content_type='text/html',
        charset='utf-8',
        from_addr='someone@dev.local',
        to_addrs=['you@dev.local'])
    client = SMTPClient()
    client.send(mail)
You can also set the SMTPClient's host, port, tls, credentials, and so on.
Mail with attachments
    # attachment.py
    from wheezy.core.mail import Attachment
    from wheezy.core.mail import MailMessage
    from wheezy.core.mail import SMTPClient

    mail = MailMessage(
        subject='Welcome to Python',
        content='Hello World!',
        from_addr='someone@dev.local',
        to_addrs=['you@dev.local'])
    mail.attachments.append(Attachment(
        name='welcome.txt',
        content='Hello World!'))
    client = SMTPClient()
    client.send(mail)
The factory method Attachment.from_file creates an attachment from a local file.
Other mail
Here we download the Python logo and embed it in the mail:
    import os.path
    from wheezy.core.mail import Alternative
    from wheezy.core.mail import MailMessage
    from wheezy.core.mail import Related
    from wheezy.core.mail import SMTPClient

    mail = MailMessage(
        subject='Welcome to Python',
        content='Hello World!',
        from_addr='someone@dev.local',
        to_addrs=['you@dev.local'])
    # HTML alternative that refers to the embedded image by content id.
    alt = Alternative("""\
    <html><body>
    <h1>Hello World!</h1>
    <p><img src="cid:python-logo.gif" /></p>
    </body></html>""", content_type='text/html')
    curdir = os.path.dirname(__file__)
    path = os.path.join(curdir, 'python-logo.gif')
    alt.related.append(Related.from_file(path))
    mail.alternatives.append(alt)
    client = SMTPClient()
    client.send(mail)
-
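For comparison, the same three kinds of message can be assembled with the standard library alone. This is a sketch of a swapped-in technique (email.message.EmailMessage, Python 3.6+), not part of wheezy.core; the addresses are the placeholder ones from the post, and nothing is actually sent.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Welcome to Python"
msg["From"] = "someone@dev.local"
msg["To"] = "you@dev.local"

# Plain-text body.
msg.set_content("Hello World!")

# HTML alternative; the message becomes multipart/alternative.
msg.add_alternative(
    "<html><body><h1>Hello World!</h1></body></html>", subtype="html")

# Text attachment; the message becomes multipart/mixed.
msg.add_attachment("Hello World!", filename="welcome.txt")

attachment_names = [part.get_filename() for part in msg.iter_attachments()]
html_body = msg.get_body(preferencelist=("html",)).get_content()

# Sending would need a reachable SMTP server, for example:
#   import smtplib
#   with smtplib.SMTP("localhost") as client:
#       client.send_message(msg)
```

The order of the calls matters: the body and its alternative are added first so that the attachment wraps them, rather than ending up inside the alternative part.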
Python Dev Tip: DRY your shell with PYTHONSTARTUP
Python Dev Tip: DRY your shell with PYTHONSTARTUP -
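Reusable juju Charms: an Ansible Role
We often use juju charms to automate the deployment of many different PaaS services, most of them WSGI applications. We usually use Ansible to make the charm automation easier, but every WSGI charm still has to repeat the same things:
    set up a specific user
    install the built code into a specific directory
    install dependency packages
    connect to backends (postgresql, elasticsearch, etc.)
    generate settings
    set up the WSGI service
    set up logging
    support updating the code without upgrading the charm
    support rolling code updates
Only three of these vary slightly from one service to another: the dependency packages, the generated settings, and the backend connections. After trying to create a reusable WSGI charm, we used Ansible's built-in support for reusable roles to create charm-bootstrap-wsgi, which covers all of the requirements above. The charm itself is very simple; the key part is using the wsgi-app role:
    roles:
      - role: wsgi-app
        listen_port: 8080
        wsgi_application: example_wsgi:application
        code_archive: "{{ build_label }}/example-wsgi-app.tar.bzip2"
        when: build_label != ''
We only need to do two things:
    tasks:
      - name: Install any required packages for your app.
        apt: pkg={{ item }} state=latest update_cache=yes
        with_items:
          - python-django
          - python-django-celery
        tags:
          - install
          - upgrade-charm

      - name: Write any custom configuration files
        debug: msg="You'd write any custom config files here, then notify the 'Restart wsgi' handler."
        tags:
          - config-changed
          # Also any backend relation-changed hooks for databases etc.
        notify:
          - Restart wsgi
Everything else is handled by the reusable wsgi-app role. -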
From LIKE to Full-Text Search (part II)
If you missed it, read the first post of this series. What do you do when you need to filter a long list of records for your users? That was the question we set out to answer in a previous post. We saw that, for simple queries, the built-in filtering provided by your framework of choice (think Django) is just fine. Most of the time, though, you'll need something more powerful. This is where PostgreSQL's full-text search facilities come in handy. We also saw that just using the to_tsvector and to_tsquery functions goes a long way toward filtering your records. But what about documents that contain accented characters? What can we do to optimize performance? How do we integrate this with Django? Hola, Mundo! We have found that the need to search documents in multiple languages is fairly common. You can query your data using to_tsquery without passing a language configuration name, but remember that, under the hood, the text search functions always use one. The default language is English, but you have to use the right language stemmer for your document language or you might not get any matches. If, for example, we search for física in Spanish documents that have … -
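To make the accented-character question concrete, here is a small Python-side illustration using only the standard library. It is an assumption-laden sketch of the general idea (compare text after removing combining marks), not what the post goes on to do; inside the database, PostgreSQL offers its own unaccent extension for this purpose. The function name strip_accents is hypothetical.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed."""
    # NFKD splits characters such as "í" into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


query = "fisica"
document_word = "física"

naive_match = query == document_word
normalized_match = strip_accents(query) == strip_accents(document_word)
```

A plain comparison treats the two spellings as different words, while the normalized comparison treats them as equal, which is usually what a user typing without accents expects.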
Django and PostgreSQL: From SQL LIKE to Full-Text Search (1)
This is how we usually filter a long list of records for a query:
    >>> Entry.objects.filter(title__icontains='Man bites dog')
In PostgreSQL this statement is translated to:
    SELECT ... WHERE title ILIKE '%Man bites dog%';
But a record such as "Man Bites Dogs Tails" still will not be matched.
PostgreSQL full-text search
We can solve this with PostgreSQL's full-text search instead of regular expressions, because regular expressions:
    have no linguistic support, so they cannot handle derived words, variants, similar words, and so on
    cannot rank results by relevance
    are slow, because there is no index support
With full-text search, derived words are matched as well:
    >>> Entry.objects.search('man is biting a tail')
    [<Entry: Man Bites Dogs Tails>]
Document and Query
A document is the unit of full-text search, and it can be anything. Here we define the title and body columns together as a single document. For speed and efficiency, the database uses a compact representation of the document (tsvector), which is a specially processed form of the original content. To query a tsvector we need a tsquery, which is a normalized query. A search is carried out by matching a tsvector against a tsquery. To use them in a query we need the to_tsvector and to_tsquery functions. Here is a statement that solves the problem from the start of this post:
    SELECT ... WHERE to_tsvector(COALESCE(title, '') || ' ' || COALESCE(body, '')) @@ to_tsquery('man & bites & dog');
About stemming
To reconcile search terms with the text of a document, PostgreSQL uses stemming dictionaries. To list the dictionaries installed in PostgreSQL:
    => \dFd
                 List of text search dictionaries
       Schema   |      Name       |                Description
    ------------+-----------------+-------------------------------------------
     pg_catalog | danish_stem     | snowball stemmer for danish language
     pg_catalog | dutch_stem      | snowball stemmer for dutch language
     pg_catalog | english_stem    | snowball stemmer for english language
     pg_catalog | finnish_stem    | snowball stemmer for finnish language
     pg_catalog | french_stem     | snowball stemmer for french language
     pg_catalog | german_stem     | snowball stemmer for german language
     pg_catalog | hungarian_stem  | snowball stemmer for hungarian language
     pg_catalog | italian_stem    | snowball stemmer for italian language
     pg_catalog | norwegian_stem  | snowball stemmer for norwegian language
     pg_catalog | portuguese_stem | snowball stemmer for portuguese language
     pg_catalog | romanian_stem   | snowball stemmer for romanian language
     pg_catalog | …
-
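The to_tsquery call in the post takes terms joined by the & operator, so free-form user input has to be turned into that shape first. The helper below is a hypothetical sketch of that step (the name build_tsquery_string is not from the post); it keeps only word characters so that stray operators in the input cannot break the query syntax. PostgreSQL's plainto_tsquery function does a similar job on the server side.

```python
import re


def build_tsquery_string(user_input):
    """Turn free text into an AND-ed tsquery string such as 'man & bites & dog'."""
    # Keep runs of word characters only; punctuation and operators are dropped.
    terms = re.findall(r"\w+", user_input.lower())
    return " & ".join(terms)


tsquery = build_tsquery_string("Man bites dog")
messy = build_tsquery_string("  man & (bites) | dog!! ")
```

The resulting string would still be passed to the database as a bound query parameter, never formatted into the SQL text directly.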
Creating a custom user model in Django 1.6 - Part 5
Hello everyone and welcome back to the fifth part of this tutorial (part of the Babbler tutorial series). If you haven't read the first four parts I'd encourage you to do so now: part 1: the model; part 2: migration and admin forms; part 3: frontend login form; part 4: admin login form. This week we will cover user registration with e-mail validation as well as the "lost / change password" workflow. As we did two weeks ago, we will be using (generic) class-based views to build our forms and Crispy forms to render them. We will also be using Templated emails to handle our email needs and we will create our first management command. -
A Simple Way to Use Multiple Languages (i18n) in Django Models
This post describes a quick way to add multi-language support to Django models. It works by creating a custom template tag that picks the right one of several per-language fields on the model. Suppose we have a models.py like the one below, in which a model contains several repeated fields, each holding the text for one display language:
    class MyObject(models.Model):
        name = models.CharField(max_length=50)
        title_en = models.CharField(max_length=50)
        title_es = models.CharField(max_length=100)
        title_fr = models.CharField(max_length=100)
        description_en = models.CharField(max_length=100)
        description_es = models.CharField(max_length=100)
        description_fr = models.CharField(max_length=100)

    class MyOtherObject(models.Model):
        name = models.CharField(max_length=50)
        content_en = models.CharField(max_length=200)
        content_es = models.CharField(max_length=200)
        content_fr = models.CharField(max_length=200)
Note that each field name ends with an underscore and a language code; this suffix is what the lookup relies on. Next, list the names of the fields to translate in settings.py:
    TRANSLATION_FIELDS = ('title', 'description', 'content')
Add a templatetags directory to the project (don't forget to add __init__.py) and create lazy_tags.py inside it:
    from django import template
    from django.conf import settings

    register = template.Library()

    class LocalizedContent(template.Node):
        def __init__(self, model, language_code):
            self.model = model
            self.lang = language_code

        def render(self, context):
            # Resolve the template variables to the model instance and language code.
            model = template.Variable(self.model).resolve(context)
            lang = template.Variable(self.lang).resolve(context)
            # Copy each language-specific field (e.g. title_es) onto the plain name (title).
            for f in settings.TRANSLATION_FIELDS:
                try:
                    setattr(model, f, getattr(model, '%s_%s' % (f, lang)))
                except AttributeError:
                    pass
            return ''

    @register.tag(name='get_localized_content')
    def get_localized_content(parser, token):
        bits = list(token.split_contents())
        if len(bits) != 3:
            raise template.TemplateSyntaxError("'get_localized_content' tag takes exactly 2 arguments")
        return LocalizedContent(model=bits[1], language_code=bits[2])
To use the custom tag in a template, load it first:
    {% load lazy_tags %}
Then call the tag with the object and a language code to get the translation, for example Spanish:
    {% get_localized_content object 'es' %}
At this point, unless a language code is passed in, obj.description cannot resolve to one of the language-specific fields. So we use the tag together with the django.core.context_processors.request context processor:
    TEMPLATE_CONTEXT_PROCESSORS = (
        ...
        'django.core.context_processors.request',
    )
Now the tag can be used in a template like this:
    {% get_localized_content object request.LANGUAGE_CODE %}
-
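The heart of the tag is the loop that copies a language-specific attribute onto the plain field name. Stripped of the template machinery, it can be sketched in plain Python as follows; the Article class and the localize function are hypothetical stand-ins used only so the example runs without Django.

```python
TRANSLATION_FIELDS = ("title", "description", "content")


class Article:
    """Hypothetical stand-in for a model instance with per-language fields."""

    def __init__(self):
        self.title_en = "Hello"
        self.title_es = "Hola"


def localize(obj, lang, fields=TRANSLATION_FIELDS):
    """Copy each '<field>_<lang>' attribute onto '<field>' when it exists."""
    for field in fields:
        localized = getattr(obj, "%s_%s" % (field, lang), None)
        if localized is not None:
            setattr(obj, field, localized)
    return obj


article = localize(Article(), "es")
```

Fields without a matching language variant are simply skipped, which mirrors the AttributeError handling in the tag.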
Release 1.2b3
-
Release 1.2b2
-
Using Chart.js with Django
Chart.js is the new kid on the block for JavaScript charts. Learn how to use it here to chart the number of user registrations for the last 30 days.
The View
This view builds an array with the number of people who registered on the site on each of the last 30 days.
    from django.views.generic import TemplateView
    from django.contrib.auth.models import User
    import arrow

    class AnalyticsIndexView(TemplateView):
        template_name = 'analytics/admin/index.html'

        def get_context_data(self, **kwargs):
            context = super(AnalyticsIndexView, self).get_context_data(**kwargs)
            context['30_day_registrations'] = self.thirty_day_registrations()
            return context

        def thirty_day_registrations(self):
            final_data = []
            date = arrow.now()
            # Walk back over the previous 30 days, one day per iteration.
            for day in range(30):
                # Step back one day (older arrow releases spelled this date.replace(days=-1)).
                date = date.shift(days=-1)
                count = User.objects.filter(
                    date_joined__gte=date.floor('day').datetime,
                    date_joined__lte=date.ceil('day').datetime).count()
                final_data.append(count)
            return final_data
The method thirty_day_registrations steps back through the last 30 days and gets the count of registrations for each day. It then returns that array to the get_context_data method, which assigns it to 30_day_registrations, the name we will use in our template.
The Template
The template is very basic in that it has just enough data to generate a line chart.
    {% extends "base.html" %}
    {% block extrahead %}
    <script src="//cdnjs.cloudflare.com/ajax/libs/Chart.js/0.2.0/Chart.min.js" type="text/javascript"></script>
    <script src="//cdnjs.cloudflare.com/ajax/libs/jquery/2.1.1/jquery.min.js" type="text/javascript"></script>
    <script type="text/javascript">
    $( document ).ready(function() {
        var data = {
            labels: ['1', '5', '10', '15', '20', '25', '30'],
            datasets: [
                {
                    label: "Site …
-
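The view issues one COUNT query per day. As a hedged alternative sketch, the same per-day list can be computed in a single pass once the join timestamps are in memory; the function name daily_counts and the sample dates below are made up for illustration, and in a Django project the timestamps would presumably come from the User table's date_joined column.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def daily_counts(joined_datetimes, today, days=30):
    """Count sign-ups per day for the `days` days before `today`, newest first."""
    per_day = Counter(dt.date() for dt in joined_datetimes)
    # Offset 1 is yesterday, offset `days` is the oldest day in the window.
    return [per_day[today - timedelta(days=offset)] for offset in range(1, days + 1)]


# Hypothetical sample data: two sign-ups on 30 July and one on 28 July.
today = date(2014, 7, 31)
joined = [
    datetime(2014, 7, 30, 9, 0),
    datetime(2014, 7, 30, 18, 30),
    datetime(2014, 7, 28, 12, 0),
]
counts = daily_counts(joined, today)
```

The list is ordered newest first, as in the view, so it may need reversing before it is paired with chart labels that run from oldest to newest.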
Tips for Upgrading Django
From time to time we inherit code bases running outdated versions of Django and part of our work is to get them running a stable and secure version. In the past year we've done upgrades from versions as old as 1.0 and we've learned a few lessons along the way. Tests are a Must You cannot begin a major upgrade without planning how you are going to test that the site works after the upgrade. Running your automated test suite should note warnings for new or pending deprecations. If you don’t have an automated test suite then now would be a good time to start one. You don't need 100% coverage, but the more you have, the more confident you will feel about the upgrade. Integration tests with Django's TestClient can help cover a lot of ground with just a few tests. You'll want to use these sparingly because they tend to be slow and fragile. However, you can use them to test your app much like a human might do, submitting forms (both valid and invalid), and navigating to various pages. As you get closer to your final target version or you find more edge cases, you can add focused unittests to … -
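One way to make a test run surface deprecations, sketched here with the standard warnings module, is to turn DeprecationWarning into an error for the duration of a call. The old_api function is a hypothetical stand-in for a deprecated framework call, and call_strictly is an illustrative helper, not something from the post.

```python
import warnings


def old_api():
    """Hypothetical function standing in for a deprecated framework call."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return "result"


def call_strictly(func):
    """Call func, reporting any DeprecationWarning instead of ignoring it."""
    with warnings.catch_warnings():
        # Inside this block, deprecation warnings are raised as exceptions.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            return func()
        except DeprecationWarning as exc:
            return "deprecated: %s" % exc


outcome = call_strictly(old_api)
```

The same effect is available for a whole test run from the command line with the interpreter's -W option, for example python -W error::DeprecationWarning manage.py test.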
Release 1.2b2
-
Release 1.2b1
-
Using Django as the Backend Framework for Bootstrap
If you want to use Bootstrap as the front-end framework for a Django project but are not sure how to tie the two together, this post shows how to integrate Bootstrap into Django using Django's built-in staticfiles app. First, download Bootstrap from getbootstrap.com. Then unzip the archive and place the extracted files in the project. Next, add the bootstrap directory to STATICFILES_DIRS in settings.py. Finally, use the following base.html as a starting point:
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="utf-8">
        <title>Bootstrap | {% block title %}Untitled{% endblock %}</title>
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <meta name="description" content="">
        <meta name="author" content="">
        <!-- Le styles -->
        <link href="{{STATIC_URL}}css/bootstrap.css" rel="stylesheet">
        <style>
            body {
                padding-top: 60px; /* 60px to make the container go all the way to the bottom of the topbar */
            }
        </style>
        <link href="{{STATIC_URL}}css/bootstrap-responsive.css" rel="stylesheet">
        <!-- Le HTML5 shim, for IE6-8 support of HTML5 elements -->
        <!--[if lt IE 9]>
        <script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
        <![endif]-->
        <script src="{{STATIC_URL}}js/jquery-1.8.1.min.js" type="text/javascript"></script>
        {% block extrahead %}
        {% endblock %}
        <script type="text/javascript">
        $(function(){
            {% block jquery %}
            {% endblock %}
        });
        </script>
    </head>
    <body>
    <div class="navbar navbar-inverse navbar-fixed-top">
        <div class="navbar-inner">
            <div class="container">
                <a class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse">
                    <span class="icon-bar"></span>
                    <span class="icon-bar"></span>
                    <span class="icon-bar"></span>
                </a>
                <a class="brand" href="/">Bootstrap</a>
                <div class="nav-collapse collapse">
                    <ul class="nav">
                        <li class="active"><a href="/">Home</a></li>
                        <li><a href="#about">About</a></li>
                        <li><a href="#contact">Contact</a></li>
                    </ul>
                </div><!--/.nav-collapse -->
            </div>
        </div>
    </div>
    <div id="messages">
        {% if messages %}
        {% for message in messages %}
        <div class="alert alert-{{message.tags}}">
            <a class="close" data-dismiss="alert">×</a>
            {{message}}
        </div>
        {% endfor %}
        {% endif %}
    </div>
    <div class="container">
        {% block content %}
        {% endblock %}
    </div> <!-- /container -->
    </body>
    </html>
With this base.html in place, you can adapt it to your own needs to get the best result. -
Collecting System Information with pyrrd
Last week we wanted to benchmark several WSGI applications, but their different concurrency models made them hard to set up and evaluate one by one. System information is an important metric here as well, for example collecting and graphing CPU and memory consumption. RRDtool is a widely used tool for this. We can measure each WSGI application with pyrrd and the subprocess module:
    from pyrrd.rrd import DataSource, RRA, RRD

    dss = [
        DataSource(dsName='cpu', dsType='GAUGE', heartbeat=4),
        DataSource(dsName='mem', dsType='GAUGE', heartbeat=4)
    ]
    rras = [RRA(cf='AVERAGE', xff=0.5, steps=1, rows=100)]
    rrd = RRD('/tmp/heartbeat.rrd', ds=dss, rra=rras, step=1)
    rrd.create()
The code above creates the file /tmp/heartbeat.rrd with data sources for CPU and memory usage, both defined as type GAUGE. We then define a round-robin archive (RRA) that stores at most 100 data points. At the end, these settings are used to create an RRD file with a step of one second. The pyrrd module uses the same terminology as rrdtool, so existing rrdtool knowledge carries over. Using subprocess:
    import re
    import subprocess
    import time

    # pids is the list of process ids (as strings) to sample.
    pattern = re.compile(r'\s+')
    command = '/bin/ps --no-headers -o pcpu,pmem -p %s' % ' '.join(pids)
    while True:
        ps = subprocess.check_output(command, shell=True)
        pcpu = 0.0
        pmem = 0.0
        # Sum %CPU and %MEM over all sampled processes.
        for line in ps.split('\n'):
            if line.strip():
                cpu, mem = map(float, pattern.split(line.strip()))
                pcpu += cpu
                pmem += mem
        rrd.bufferValue(time.time(), pcpu, pmem)
        rrd.update()
        time.sleep(1)
The ps command reports the %CPU and %MEM usage of each pid; its output is then processed and saved to the rrd file. Note that this is not a typical rrdtool use case. Typical usage looks like this:
    # -*- coding: utf-8 -*-
    from __future__ import (absolute_import, division, print_function,
                            with_statement, unicode_literals)

    import re
    import signal
    import sys
    import time
    import subprocess
    from random import random, randint

    from pyrrd.rrd import DataSource, RRA, RRD

    def usage():
        print('%s rrdfile pid, ...' % __file__)
        print('sample the %cpu and %mem for specified pids, aggregated')

    def sigint_handler(signal, frame):
        print("Sampling finishes: %.2f." % time.time())
        sys.exit(0)

    def main():
        rrd_file = sys.argv[1]
        pids = sys.argv[2:]
        dss = [
            DataSource(dsName='cpu', dsType='GAUGE', heartbeat=4),
            DataSource(dsName='mem', dsType='GAUGE', heartbeat=4)
        ]
        rras = [RRA(cf='AVERAGE', xff=0.5, steps=1, rows=100)]
        rrd = RRD(rrd_file, ds=dss, rra=rras, step=1)
        rrd.create()
        signal.signal(signal.SIGINT, sigint_handler)
        pattern = re.compile(r'\s+')
        command = '/bin/ps …
-
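The parsing step in the sampling loop can be isolated and checked on its own. The sketch below assumes ps output in the two-column pcpu/pmem layout used in the post; the function name aggregate_ps_output and the sample text are hypothetical, and no ps command or rrd file is involved.

```python
import re

_WHITESPACE = re.compile(r"\s+")


def aggregate_ps_output(ps_output):
    """Sum the %CPU and %MEM columns over all non-empty lines of ps output."""
    total_cpu = 0.0
    total_mem = 0.0
    for line in ps_output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. the trailing newline
        cpu, mem = map(float, _WHITESPACE.split(line))
        total_cpu += cpu
        total_mem += mem
    return total_cpu, total_mem


# Made-up sample in the shape produced by `ps --no-headers -o pcpu,pmem`.
sample = " 1.5  0.5\n 2.0  1.25\n\n"
totals = aggregate_ps_output(sample)
```

Keeping the parsing in a pure function like this makes the sampler easier to test than a loop that mixes subprocess calls, parsing, and rrd updates.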
Release 1.2b1
-
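A Bug Caused by Code Running at the Top Level of a Python Module
A few weeks ago a colleague came over and said he had found a strange bug: when running a child process with Python's subprocess module, no error was raised when the child failed! We tried it at the interactive Python prompt:
    >>> import subprocess
    >>> subprocess.check_call("false")
    0
On other machines, the same code correctly raised an error:
    >>> subprocess.check_call("false")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 542, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command 'false' returned non-zero exit status 1
It looked as though subprocess wrongly believed that the child had exited successfully.
Digging deeper
At first glance the problem seemed to be caused by Python itself or by the operating system. How could this happen? My colleague looked at subprocess's wait() method:
    def wait(self):
        """Wait for child process to terminate.  Returns returncode
        attribute."""
        while self.returncode is None:
            try:
                pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
            except OSError as e:
                if e.errno != errno.ECHILD:
                    raise
                # This happens if SIGCLD is set to be ignored or waiting
                # for child processes has otherwise been disabled for our
                # process.  This child is dead, we can't get the status.
                pid = self.pid
                sts = 0
            # Check the pid and loop as waitpid has been known to return
            # 0 even without WNOHANG in odd situations.  issue14396.
            if pid == self.pid:
                self._handle_exitstatus(sts)
        return self.returncode
As you can see, if os.waitpid fails with ECHILD, no error is raised; the exit status is simply assumed to be 0. Normally, after a process ends, the system keeps its information until the parent process calls wait(). During that period the process is called a "zombie". If the child process no longer exists, there is no way to know whether it succeeded or failed. The code above also explains something else: Python assumes by default that the child exited successfully. In most cases that assumption is fine. But when a process explicitly ignores SIGCHLD, wait() will always report success.
Back to the original code
Were we explicitly ignoring SIGCHLD in our program? That seemed unlikely, because we use a large number of child processes and the problem showed up only in rare cases. After a git grep, we found that SIGCHLD was ignored in just one standalone piece of code. But that code was not really part of the program at all; it was only referenced.
A week later
A week later the error occurred again, and with some simple debugging we reproduced it in the debugger. After some testing we confirmed that the bug was indeed caused by the program ignoring SIGCHLD. But how did that happen? We looked at the standalone code, which contained this line:
    signal.signal(signal.SIGCHLD, signal.SIG_IGN)
Had we inadvertently imported this code into the program? It turned out that our guess was right. Once the code was imported, the statement ran because it sits at the top level of the module rather than inside a function. It ignored SIGCHLD, and as a result child process errors were never raised!
Summary
The occurrence of this bug, …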
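The effect described in the post can be reproduced in a few lines. This is a POSIX-only sketch (signal.SIGCHLD does not exist on Windows); the helper name exit_code_with_sigchld is hypothetical, and the child is a small Python one-liner that exits with status 1, standing in for the false command. The handler is restored afterwards so the experiment does not leak into the rest of the process.

```python
import signal
import subprocess
import sys

# A child process that always exits with status 1, like the `false` command.
FAILING_CHILD = [sys.executable, "-c", "import sys; sys.exit(1)"]


def exit_code_with_sigchld(handler):
    """Run the failing child with SIGCHLD set to `handler`, then restore it."""
    previous = signal.signal(signal.SIGCHLD, handler)
    try:
        return subprocess.call(FAILING_CHILD)
    finally:
        signal.signal(signal.SIGCHLD, previous)


# With the default disposition the real exit status is reported.
default_code = exit_code_with_sigchld(signal.SIG_DFL)

# With SIGCHLD ignored the kernel reaps the child itself, waitpid() fails with
# ECHILD, and subprocess falls back to assuming an exit status of 0.
ignored_code = exit_code_with_sigchld(signal.SIG_IGN)
```

On a typical Linux system the first call is expected to report the failure and the second to hide it. Wrapping the signal call in a function, rather than leaving it at module top level, is what keeps a plain import from changing process-wide state.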