Django community: RSS
This page, updated regularly, aggregates blog posts from the Django community.
-
Want to work for Eventbrite?
Join me, Andrew Godwin (South, Django migrations), Simon Willison (co-founder of Django, co-founder of Lanyrd), and many other talented people at Eventbrite. We have great challenges, the kind that inspire you to rise to the occasion. We need you to help us overcome them. I should mention that Eventbrite is committed to giving back to the community. Most notably, Eventbrite just contributed £5000 to the Django REST Framework Kickstarter, or about US$8500! We're a frequent sponsor of events around the world. It doesn't stop there: whenever a tool outside our core domain of running events comes up in discussion, Eventbrite managers ask, "When can we open source this?" For someone who loves working on open source, Eventbrite is the place to be. I say this because I know what we're planning to do in the future. If you join us, you'll find out sooner rather than later. ;) What's Eventbrite like as a company? Well, we're rated in the top 20 of best places to work in the United States. We get full benefits, free lunch, educational opportunities, and much more. In addition, I have to say that my co-workers are friendly, intelligent, always learning, and love to do things … -
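Website Stress-Testing Tools
Reposted from: http://www.yeolar.com/note/2012/11/24/web-bench-test/

Stress testing a website means measuring how much traffic it can withstand and how it performs under heavy load. These figures directly affect the user experience, so a stress test is normally run before a site goes live. It is also an important way to evaluate the web server and framework the site relies on.

Because the test environment differs from production, simulated results will never match the real load exactly, but they are still a good baseline for comparison. A stress test should also simulate real conditions as closely as possible.

The stress-testing tools most commonly recommended online are ab, webbench, http_load, siege, curl-loader, multi-mechanize, and tcpcopy.

Most of these tools use an event-driven model to create simulated users: ab uses the apr_pollset_poll function from the apr library, while the others use select. Only webbench creates its simulated users by forking child processes, which lets it simulate higher concurrency.

Here is a summary of what people have written about them.

ab

The stress-testing tool bundled with Apache; a standalone version also exists. It is mainly used to measure how many requests per second a site can handle, mostly for static content. Basic usage:

$ ab -n 1000 -c 50 http://192.168.1.101/

-n  total number of requests
-c  number of concurrent connections

The output looks like this:

Server Software: Apache/2.2.16
Server Hostname: 192.168.1.101
Server Port: 80
Document Path: /  (path of the requested document)
Document Length: 14643 bytes  (size of the requested document)
Concurrency Level: 50  (concurrency)
Time taken for tests: 38.724 seconds  (total test time)
Complete requests: 1000  (total requests)
Failed requests: 14  (failed requests)
(Connect: 0, Receive: 0, Length: 14, Exceptions: 0)
Write errors: 14
Total transferred: 14847500 bytes  (total data transferred)
HTML transferred: 14548500 bytes  (HTML data transferred)
Requests per second: 25.82 [#/sec] (mean)  (mean requests per second)
Time per request: 1936.210 [ms] (mean)  (mean time for one round of concurrent requests)
Time per request: 38.724 [ms] (mean, across all concurrent requests)  (mean time per request)
Transfer rate: 374.43 [Kbytes/sec] received  (transfer rate)

Connection Times (ms)
              min  mean[+/-sd]  median    max
Connect:        2   668 1905.0     135  12237  (connection time)
Processing:     0  1244 1652.3     902  14963  (processing time)
Waiting:        0  1222 1651.1     883  14955  (waiting time)
Total:        134  1912 2723.5    1126  15096

Percentage of the requests served within a certain time (ms)
 50%   1126
 66%   1321
 75%   1369
 80%   1408
 90%   1917
 95%  10122
 98%  13030
 99%  13884
100%  15096 (longest request)

webbench

It mainly measures requests per second and supports static content, dynamic content, and SSL. A single instance can simulate up to 30,000 concurrent clients, which makes it suitable for stress testing small sites. Basic usage:

$ webbench -c 100 -t 60 http://192.168.1.101/

-c  concurrency
-t  test duration

The results look like this: Benchmarking: … -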
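The headline figures in the ab output above are related by simple arithmetic, and recomputing them is a quick way to check that a report has been read correctly. The sketch below is only an illustration: the function name `ab_summary` is made up for this example, and it applies the usual definitions of ab's summary lines to the totals from the sample run.

```python
def ab_summary(total_requests, concurrency, seconds):
    """Recompute ab's headline numbers from its raw totals."""
    # Completed requests divided by wall-clock time.
    requests_per_second = total_requests / seconds
    # Mean time for one request as seen by one simulated client, in ms.
    time_per_request_ms = concurrency * seconds * 1000 / total_requests
    # Mean time per request across all concurrent clients, in ms.
    time_per_request_all_ms = seconds * 1000 / total_requests
    return requests_per_second, time_per_request_ms, time_per_request_all_ms


# Totals taken from the sample output: -n 1000 -c 50, 38.724 seconds.
rps, per_request, per_request_all = ab_summary(1000, 50, 38.724)
print(f"{rps:.2f} req/s, {per_request:.1f} ms, {per_request_all:.3f} ms")
```

The results agree, to rounding, with the 25.82 requests per second, roughly 1936 ms, and 38.724 ms that ab reported.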
Django and PostgreSQL: From SQL LIKE to Full-Text Search (2)
In the previous post we dealt with exact searching; in this one we look at accents and closely related word forms. With full-text search it is common to search documents written in several languages. We can call to_tsquery without specifying a language, but at run time full-text search will always apply at least one anyway. The default configuration is English, and you must use the stemmer that matches the language of your documents, otherwise matches will be missed.

For example, searching Spanish documents for física gives exact matches:

=> SELECT text FROM terms WHERE to_tsvector(text) @@ to_tsquery('física');
text
-------------------------------------------------------------------------
física (aparatos e instrumentos de —)
física (educación —)
física (investigación en —)
rehabilitación física (aparatos de —) para uso médico
educación física
conversión de datos y programas informáticos, excepto conversión física
investigación en física
terapia física
(8 rows)

But searching for fisica, without the accent, returns nothing:

=> SELECT text FROM terms WHERE to_tsvector(text) @@ to_tsquery('fisica');
text
------
(0 rows)

To have física and its variants (físicas, físico, físicamente, and so on) show up in the results, we must use the right stemmer. If the stemmer does not know the word, we will not get the right result either:

=> SELECT ts_lexize('english_stem', 'programming');
ts_lexize
-----------
{program}
(1 row)

=> SELECT ts_lexize('spanish_stem', 'programming');
ts_lexize
---------------
{programming}
(1 row)

But with the correct language configuration:

=> SELECT text FROM terms WHERE to_tsvector('spanish', text) @@ to_tsquery('spanish', 'física');
text
--------------------------------------------------------------------------------
física (aparatos e instrumentos de —)
ejercicios físicos (aparatos para —)
entrenamiento físico (aparatos de —)
físicos (aparatos para ejercicios —)
física (educación —)
preparador físico personal [mantenimiento físico] (servicios de —)
física (investigación en —)
ejercicio físico (aparatos de —) para uso médico
rehabilitación física (aparatos de —) para uso médico
aparatos para ejercicios físicos
almacenamiento de soportes físicos de datos o documentos electrónicos
clases de mantenimiento físico
clubes deportivos [entrenamiento y mantenimiento físico]
educación física
conversión de datos o … -
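The reason fisica finds nothing in the entry above is that, to the search engine, "í" and "i" are simply different characters. One way to see what accent-insensitive matching has to do is to fold accents away before comparing. The following is a minimal sketch using only the Python standard library; the helper name `strip_accents` is made up for this example, and it is an illustration of the idea rather than a description of how PostgreSQL handles accents.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed."""
    # NFD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep everything else.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("física") == "fisica")
print(strip_accents("educación física"))
```

If both the indexed text and the query were folded this way, fisica and física would compare equal; stemming (físicas, físico) would still need the correct language configuration.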
Contributing Back to Symposion
Recently Caktus collaborated with the organizers of PyOhio, a free regional Python conference, to launch the PyOhio 2014 conference website. The conference starts this weekend, July 26 - 27. As in prior years, the conference website uses Eldarion’s Symposion, an open source conference management system. Symposion powers a number of annual conference sites including PyCon and DjangoCon. In fact, as of this writing, there are 78 forks of Symposion, a nod to its widespread use for events both large and small. This collaboration afforded us the opportunity to abide by one of our core tenets, that of giving back to the community. PyOhio organizers had identified a few pain points during last year’s rollout that could be resolved in a way that was conducive to contributing back to Symposion, so that future adopters could benefit from this work. The areas we focused on were migration support, refining the user experience for proposal submitters and sponsor applicants, and schedule building. Migration Support https://github.com/pinax/symposion/pull/47 The majority of our projects use South for tracking database migrations. It is not an absolute requirement, but for those conferences that reused the same code base from year to year, rather than starting a new repository, it would be … -
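Things About TCP (Part 2)
Part 1 covered the TCP header, the state machine, and data retransmission. But TCP has one more big job: adjusting its sending rate dynamically to suit network conditions, which on a small scale keeps its own connection stable and on a large scale keeps the whole network stable. Before you read on, be prepared: this part contains quite a few algorithms and strategies that may set you thinking and make your brain allocate a lot of memory and compute, so it is not suitable for reading in the toilet.

TCP's RTT algorithm

If the retransmission timeout is set too long, retransmission is slow: a lost packet is resent only after a long wait, which is inefficient and performs poorly. If it is set too short, packets that were never lost get resent; retransmissions fire quickly, add to network congestion, cause more timeouts, and more timeouts cause more retransmissions.

Moreover, no fixed value works across different networks. The timeout can only be set dynamically. To do that, TCP introduced the RTT (Round Trip Time), the time from sending a packet to getting its reply. The sender then knows roughly how long a round trip takes and can set the timeout, the RTO (Retransmission TimeOut), so that retransmission works efficiently. It sounds simple: the sender records t0 when it sends a packet, records t1 when the ACK comes back, and RTT = t1 - t0. It is not that simple, because that is only one sample and is not representative.

Classic algorithm

The classic algorithm defined in RFC 793 works like this:

1) Sample the RTT, recording the last several RTT values.
2) Compute the smoothed RTT (SRTT) with the formula below, where α is between 0.8 and 0.9. This is an exponential weighted moving average:
   SRTT = (α * SRTT) + ((1 - α) * RTT)
3) Compute the RTO:
   RTO = min [ UBOUND, max [ LBOUND, (β * SRTT) ] ]
   where UBOUND is the upper limit on the timeout, LBOUND is the lower limit on the timeout, and β is usually between 1.3 and 2.0.

Karn / Partridge algorithm

The algorithm above runs into a fundamental problem on retransmission: do you take the RTT sample from the first transmission to the ACK, or from the retransmission to the ACK? Whichever you choose, fixing one problem creates another. The figure in the original post shows two cases:

Case (a): the ACK never came back, so the segment was retransmitted. If you measure from the first transmission to the ACK, the sample is clearly too large.
Case (b): the ACK was slow, which triggered a retransmission, but shortly after the retransmission the earlier ACK arrived. If you measure from the retransmission to the ACK, the sample is too small.

So in 1987 the Karn / Partridge algorithm appeared. Its main feature is to ignore retransmissions: RTT samples from retransmitted segments are not used (you see, you do not need to solve a problem that does not exist).

But this introduces a big bug. If the network hiccups and suddenly becomes slow, the large delay causes every packet to be retransmitted (because the previous RTO was small), and since retransmitted samples are not counted, the RTO is never updated. That is a disaster. So Karn's algorithm uses a trick: whenever a retransmission happens, double the current RTO (this is the so-called exponential backoff). Obviously such a rigid rule is not reliable either when you need a reasonably accurate RTT estimate.

Jacobson / Karels algorithm

Both algorithms above use a weighted moving average, whose biggest flaw is that a large swing in RTT is hard to notice because it gets smoothed away. So in 1988 a new algorithm was introduced, the Jacobson / Karels algorithm (see RFC 6298). It uses the difference between the latest RTT sample and the smoothed SRTT as a factor in the calculation. The formulas are as follows (DevRTT means Deviation RTT):

SRTT = SRTT + α (RTT - SRTT)                      (smoothed RTT)
DevRTT = (1 - β) * DevRTT + β * (|RTT - SRTT|)    (smoothed gap between SRTT and the real RTT, as a weighted moving average)
RTO = µ * SRTT + ∂ * DevRTT                       (the god-like formula)

(On Linux, α = 0.125, β = 0.25, µ = 1, ∂ = 4. This is the "nicely tuned parameters" part of the algorithm: nobody knows why, it just works…) This last algorithm is the one used in TCP today (the Linux source is in tcp_rtt_estimator).

TCP sliding window

It has to be said that if you do not understand TCP's sliding window, you do not understand TCP. We all know that TCP has to solve reliable delivery and packet reordering, so TCP must know the network's actual data-processing bandwidth or speed in order not to cause congestion and packet loss.

TCP therefore introduced several techniques and designs for network flow control, and the sliding window is one of them. As mentioned before, the TCP header has a field called Window, also known as Advertised-Window, with which the receiver tells the sender how much buffer space it still has for incoming data. The sender can then send according to the receiver's capacity without overwhelming it. To explain the sliding window, we first need to look at some of the data structures in the TCP buffers.

In the figure in the original post we can see:

On the receiver, LastByteRead points to the position read so far in the TCP buffer, NextByteExpected points to the end of the contiguous data received, and LastByteRcvd points to the last byte received. Some data in between has not arrived yet, so there are gaps.
On the sender, LastByteAcked points to the position acknowledged by the receiver (meaning it was sent and confirmed), LastByteSent marks what has been sent but not yet acknowledged, and LastByteWritten points to where the upper-layer application is currently writing.

So the receiver reports AdvertisedWindow = MaxRcvBuffer - LastByteRcvd - 1 in the ACKs it sends back, and the sender uses this window to limit how much data it sends, so that the receiver can cope.

Next, the sender's sliding window. The diagram is divided into four parts (the black box is the sliding window):

#1 data that has been sent and acknowledged.
#2 data that has been sent but not yet acknowledged.
#3 data inside the window that has not been sent yet (the receiver still has room).
#4 data outside the window (the receiver has no room).

A second diagram shows the window after it has slid (the ACK for 36 was received and bytes 46-51 were sent), and a third shows the receiver controlling the sender.

Zero Window

In that last diagram we can see how a slow server (the receiver) brings the client's (the sender's) TCP sliding window down to 0. At this point you will surely ask what TCP does once the window is 0. Does the sender stop sending data? Yes, it does; you can think of it as "Window Closed". Then you will also ask: if the sender has stopped sending, how does it find out when the receiver has window space available again a little later?

To solve this, TCP uses the Zero Window Probe technique, abbreviated ZWP. Once the window becomes 0, the sender sends ZWP packets to the receiver so that the receiver ACKs its window size. This is typically tried 3 times, each about 30-60 seconds apart (implementations may differ). If the window is still 0 after 3 tries, some TCP implementations send an RST to break the connection.

Note: anywhere there is waiting, a DDoS attack is possible, and Zero Window is no exception. Some attackers establish an HTTP connection, send a GET request, and then set the window to 0, so the server can only wait and send ZWPs; the attacker then issues a large number of such requests concurrently and exhausts the server's resources. (For more on this kind of attack, see the Wikipedia entry on SockStress.)

In Wireshark you can filter packets with tcp.analysis.zero_window and then use Follow TCP Stream from the right-click menu to see the ZeroWindowProbe and ZeroWindowProbeAck packets.

Silly Window Syndrome

As you saw above, if the receiver is too busy to drain the data in its receive window, the sender's window gets smaller and smaller. Eventually, if the receiver frees up a few bytes and advertises a window of a few bytes, the sender will unhesitatingly send those few bytes.

Bear in mind that the TCP and IP headers take 40 bytes. Paying that overhead to carry a few bytes of data is very uneconomical.

You also need to know about the MTU. For Ethernet the MTU is 1500 bytes; subtracting the 40 bytes of TCP and IP headers leaves 1460 bytes for real data, which is the MSS (Max Segment Size). Note that the TCP RFC defines the default MSS as 536, because RFC 791 says any IP device must accept datagrams of at least 576 bytes (in practice 576 is the MTU of dial-up networks, and 576 minus the 40 bytes of IP and TCP headers is 536).

If your packets fill the MTU you can use the whole bandwidth; if they do not, you waste bandwidth. (A packet larger than the MTU meets one of two fates: it is dropped outright, or it is fragmented and sent in pieces.) Think of the MTU as the maximum number of passengers a plane can carry: a full plane gives the best throughput, while a plane carrying a single passenger clearly costs more and is rather silly. Silly Window Syndrome is like flying a 200-seat plane with only one or two people on board.

Solving this is not hard: avoid responding to small window sizes and respond only once the window is large enough. This idea can be implemented at both the sender and the receiver.

If the problem is caused by the receiver, David D Clark's solution is used. On the receiver side, if incoming data brings the window size below a certain value, the receiver can send ack(0) back to the sender, which closes the window and stops the sender from sending more data. Once the receiver has processed some data and the window size is at least the MSS, or the receive buffer is half empty, it can reopen the window and let the send … -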
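The Jacobson / Karels formulas quoted in the entry above are easy to try out on a few RTT samples. The following is a minimal sketch, not the Linux implementation: the class name `RtoEstimator` is made up for this example, the constants are the Linux values quoted in the article, seeding the estimator from the first sample (SRTT = RTT, DevRTT = RTT / 2) follows RFC 6298, and updating the deviation before the smoothed RTT is likewise the RFC 6298 ordering rather than something the article specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RtoEstimator:
    alpha: float = 0.125      # weight of a new sample in SRTT
    beta: float = 0.25        # weight of a new sample in DevRTT
    mu: float = 1.0           # multiplier on SRTT
    dev_factor: float = 4.0   # multiplier on DevRTT
    srtt: Optional[float] = None
    dev_rtt: float = 0.0

    def update(self, rtt: float) -> float:
        """Feed one RTT sample (any time unit) and return the new RTO."""
        if self.srtt is None:
            # Assumption: first-sample seeding as described in RFC 6298.
            self.srtt = rtt
            self.dev_rtt = rtt / 2
        else:
            # Deviation uses the SRTT from before this sample.
            self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * abs(rtt - self.srtt)
            self.srtt = self.srtt + self.alpha * (rtt - self.srtt)
        return self.rto

    @property
    def rto(self) -> float:
        # RTO = mu * SRTT + dev_factor * DevRTT; call update() at least once first.
        return self.mu * self.srtt + self.dev_factor * self.dev_rtt


estimator = RtoEstimator()
for sample_ms in (100.0, 200.0, 100.0):
    print(sample_ms, estimator.update(sample_ms))
```

Because the deviation term is multiplied by 4, a sudden jump in RTT raises the RTO much faster than the smoothed average alone would, which is exactly the weakness of the earlier algorithms that the article describes.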
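Things About TCP (Part 1)
Reposted from: http://coolshell.cn/articles/11564.html

TCP is a hugely complex protocol, because it has to solve many problems, and those problems bring many sub-problems and dark corners with them. Learning TCP is therefore a fairly painful process, but a rewarding one. For the details of the protocol I still recommend W. Richard Stevens' "TCP/IP Illustrated, Volume 1: The Protocols" (you can of course also read RFC 793 and the many RFCs that followed). I will use the English terms in this article so that you can search for related technical documentation with those keywords.

I had three reasons for writing this article:

1) To test whether I can describe something as complex as TCP clearly in a short space.
2) Many programmers today rarely read a book properly and prefer fast food, so I hope this fast-food article gives you some understanding of this classic technology, lets you appreciate the many difficulties of software design, and gives you something to take away for your own designs.
3) Most importantly, I hope these fundamentals clear up things you previously only half understood and make you realise how important the fundamentals are.

So this article does not try to cover everything; it is only a popular introduction to the TCP protocol, its algorithms, and its principles.

I originally meant to write a single article, but TCP is damn complex, far more complex than C++, and over more than 30 years it has seen all kinds of optimisations, variants, debates, and revisions. As I wrote, I found I had to split it in two. Part 1 introduces the definition of the TCP protocol and the retransmission mechanism for lost packets. Part 2 focuses on TCP flow control and congestion handling.

Enough preamble. First, TCP sits at layer 4 of the seven-layer OSI model, the Transport layer; IP is at layer 3, the Network layer; ARP is at layer 2, the Data Link layer. Data at layer 2 is called a Frame, data at layer 3 a Packet, and data at layer 4 a Segment.

Our program's data is first put into a TCP Segment, the TCP Segment is put into an IP Packet, and that in turn into an Ethernet Frame. After it reaches the other end, each layer parses its own protocol and hands the data up to the next layer.

TCP header format

Looking at the format of the TCP header, note the following points:

A TCP segment carries no IP addresses; those belong to the IP layer. It does carry a source port and a destination port.
A TCP connection is identified by a four-tuple (src_ip, src_port, dst_ip, dst_port). Strictly speaking it is a five-tuple that also includes the protocol, but since we are only talking about TCP here I will say four-tuple.

Four fields in the header are especially important:

Sequence Number is the sequence number of the segment, used to solve reordering.
Acknowledgement Number is the ACK, used to confirm receipt and so solve packet loss.
Window, also called Advertised-Window, is the famous sliding window, used for flow control.
TCP Flag is the type of the segment, used mainly to drive the TCP state machine.

The remaining fields are shown in the figure in the original post.

The TCP state machine

In fact, transmission on the network is connectionless, and that includes TCP. A TCP "connection" is just a "connection state" maintained by the two communicating parties, so that it looks as if there is a connection. That is why TCP state transitions matter so much.

The original post places the TCP state machine diagram (image source) side by side with diagrams of connection setup, connection teardown, and data transfer, so that you can compare them. These two figures are very, very important and you should memorise them. (A gripe: a state machine this complex tells you how complex the protocol is, and complex things always come with nasty surprises, so TCP has its share of them too.)

Many people ask why setting up a connection takes a 3-way handshake while tearing it down takes a 4-way handshake.

The 3-way handshake mainly exists to initialise the Sequence Number. The two parties must tell each other their initial Sequence Number (abbreviated ISN, Initial Sequence Number), which is why the packet is called SYN, short for Synchronize Sequence Numbers. These are the x and y in the figure. The number serves as the sequence number for subsequent data so that the data the application layer receives is not reordered by problems in transit (TCP uses it to reassemble the data).

As for the 4-way teardown, if you look closely it is really 2 steps: because TCP is full duplex, both the sender and the receiver need a FIN and an ACK. One side is simply passive, so it looks like a 4-way exchange. If both sides close at the same time, they enter the CLOSING state and then reach TIME_WAIT. The original post includes a diagram of a simultaneous close (which you can again compare against the TCP state machine).

A few more things deserve attention:

SYN timeout during connection setup. Suppose the server receives the client's SYN and replies with SYN-ACK, and then the client goes offline. The server never receives the client's ACK, so the connection sits in an intermediate state, neither succeeded nor failed. If the server receives nothing within a certain time, it resends the SYN-ACK. On Linux the default number of retries is 5, and the retry interval starts at 1s and doubles each time: the 5 retries happen after 1s, 2s, 4s, 8s, and 16s, 31s in total. After the fifth retry it still has to wait 32s to know that the fifth one has also timed out, so in total it takes 1s + 2s + 4s + 8s + 16s + 32s = 2^6 - 1 = 63s before TCP drops the connection.

SYN flood attacks. Some malicious people exploit this to mount SYN flood attacks: send the server a SYN and then go offline, so the server waits the default 63s before dropping the connection. In this way the attacker can exhaust the server's SYN queue so that legitimate connection requests cannot be handled. Linux therefore provides a parameter called tcp_syncookies. When the SYN queue is full, TCP builds a special Sequence Number (also called a cookie) from the source address and port, the destination address and port, and a timestamp, and sends it back. An attacker will not respond; a legitimate client will send the SYN cookie back, and the server can then establish the connection from the cookie (even though the client is not in the SYN queue). Please do not use tcp_syncookies to handle ordinary heavy load, because syncookies are a compromised version of the TCP protocol and are not rigorous. For legitimate requests there are three TCP parameters you can tune instead: tcp_synack_retries reduces the number of retries; tcp_max_syn_backlog increases the number of SYN connections; tcp_abort_on_overflow simply rejects connections when the server cannot keep up.

ISN initialisation. The ISN must not be hard coded, or things go wrong. For example, suppose connections always start with an ISN of 1. The client sends 30 segments and the network drops, so the client reconnects and again uses 1 as the ISN; then packets from the old connection arrive and are treated as packets of the new connection. The client's Sequence Number might now be 3 while the server thinks it is 30. Total chaos. RFC 793 says the ISN is tied to a fictitious clock that increments the ISN every 4 microseconds until it exceeds 2^32 and wraps to 0. One ISN cycle is thus about 4.55 hours. Since we assume a TCP Segment never survives on the network longer than the Maximum Segment Lifetime (abbreviated MSL, see the Wikipedia entry), as long as the MSL is less than 4.55 hours we will not reuse an ISN.

MSL and TIME_WAIT. From the description of the ISN above you can see where the MSL comes from. In the TCP state diagram, the transition from TIME_WAIT to CLOSED has a timeout of 2*MSL (RFC 793 defines the MSL as 2 minutes; Linux sets it to 30s). Why have TIME_WAIT at all, rather than going straight to CLOSED? There are two main reasons: 1) TIME_WAIT ensures there is enough time for the peer to receive the ACK; if the passively closing side does not receive the ACK, it resends its FIN, and that round trip is exactly 2 MSL. 2) It leaves enough time for this connection not to get mixed up with later ones (some routers take it upon themselves to cache IP packets, and if the connection were reused, those delayed packets could get mixed into the new connection). See the article "TIME_WAIT and its design implications for protocols and scalable client server systems".

Too many TIME_WAIT sockets. From the above we know TIME_WAIT is an important state, but with many concurrent short-lived connections there will be too many of them, which consumes a lot of system resources. Search the web and nine times out of ten you will be told to set two parameters, tcp_tw_reuse and tcp_tw_recycle. Both are off by default; recycle is the more aggressive of the two and reuse the gentler. Also, tcp_tw_reuse has no effect unless tcp_timestamps=1 is set. Be careful here: enabling these two parameters has serious pitfalls and can cause strange TCP connection problems (as described above, if a connection is reused without waiting for the timeout, new connections may fail to establish. As the official documentation says, "It should not be changed without advice/request of technical experts").

tcp_tw_reuse. The official documentation says that tcp_tw_reuse combined with tcp_timestamps (also known as PAWS, for Protection Against Wrapped Sequence Numbers) is safe from the protocol's point of view, but tcp_timestamps must be enabled on both sides (you can read the source of tcp_twsk_unique). My own guess is that some scenarios will still have problems.

tcp_tw_recycle. If tcp_tw_recycle is enabled, the kernel assumes the peer has tcp_timestamps enabled and compares timestamps; if the timestamp has increased, the socket can be reused. But if the peer is behind NAT (for example, a company that reaches the public internet through a single IP) or the peer's IP has been reused by another machine, things get complicated. The SYN that sets up the connection may simply be dropped (you may see a connection time out error). (If you want to look at the Linux kernel code, see tcp_timewait_state_process.)

tcp_max_tw_buckets. This controls the number of concurrent TIME_WAIT sockets. The default is 180000; if the limit is exceeded, the system destroys the excess and logs a warning (such as: time wait bucket table overflow). The official documentation says this parameter exists to resist DDoS attacks, and also says the default of 180000 is not small. It still needs to be judged against your actual situation.

Again, using tcp_tw_reuse and tcp_tw_recycle to solve the TIME_WAIT problem is very, very dangerous, because these two parameters violate the TCP protocol (RFC 1122).

In fact, TIME_WAIT means you were the one who actively closed the connection, so this is a case of "if you don't ask for trouble, you won't get it". If you let the peer close the connection, the wretched problem becomes theirs, heh. Also, if your server is an HTTP server, consider how important it is to set HTTP KeepAlive (the browser will reuse one TCP connection for several HTTP requests) and then let the client close the connection (but be careful: browsers can be very greedy and will not close a connection unless they absolutely have to).

Sequence Number during data transfer

The original post shows a Wireshark capture of data transfer while visiting coolshell.cn, to show how the SeqNum changes (using Statistics -> Flow Graph… in the Wireshark menu).

You can see that the SeqNum grows with the number of bytes transferred. In the capture, after the 3-way handshake come two packets with Len:1440, and the second packet's SeqNum is 1441. The first ACK then carries 1441, meaning the first 1440 bytes were received.

Note: if you watch the 3-way handshake in Wireshark you will find the SeqNum is always 0. That is not really the case; to make the display friendlier Wireshark uses a relative SeqNum. Turn it off under protocol preferences in the right-click menu and you will see the absolute SeqNum.

TCP retransmission

TCP has to guarantee that every packet arrives, so it needs a retransmission mechanism.

Note that the ACK the receiver sends the sender only acknowledges the last contiguous packet. For example, the sender sends five pieces of data, 1, 2, 3, 4, 5. The receiver gets 1 and 2, so it replies ack 3, and then it receives 4 (note that 3 has not arrived). What does TCP do now? As mentioned earlier, SeqNum and Ack count bytes, so an ACK cannot skip ahead; it can only acknowledge the highest contiguous data received, otherwise the sender would think everything before it had arrived.

Timeout retransmission

One approach is not to ack and simply wait for 3. When the sender notices that the ack for 3 has timed out, it retransmits 3. Once the receiver gets 3, it acks 4, which means both 3 and 4 have been received.

This approach has a fairly serious problem, though. Because the receiver is waiting for 3, the sender has no idea what has happened to 4 and 5 even if they have arrived; having received no Ack for them, it may pessimistically assume they were lost too, so 4 and 5 may be retransmitted as well.

There are two options here:

One is to retransmit only the packet that timed out, that is, piece 3.
The other is to retransmit everything from the timeout onwards, that is, pieces 3, 4, and 5.

Each has pros and cons. The first saves bandwidth but is slow; the second is a bit faster but wastes bandwidth and may do useless work. On the whole neither is good, because both wait for the timeout, and the timeout may be long (Part 2 explains how TCP computes the timeout dynamically).

Fast retransmit

So TCP introduced an algorithm called Fast Retransmit, in which retransmission is driven by data rather than by time. If packets do not arrive in sequence, the receiver acks the last packet that may have been lost, and if the sender receives the same ack 3 times in a row, it retransmits. The benefit of Fast Retransmit is that it does not wait for the timeout before retransmitting.

For example, the sender sends pieces 1, 2, 3, 4, 5. The first arrives, so the receiver acks 2. Piece 2 is lost for some reason; 3 arrives, so the receiver acks 2 again; then 4 and 5 arrive, and it still acks 2 because 2 has not been received. The sender thus receives three acks with ack=2, knows that 2 has not arrived, and immediately retransmits 2. The receiver then gets 2, and since 3, 4, and 5 have already arrived, it acks 6.

Fast Retransmit solves only one problem, the timeout. It still faces a tough choice: retransmit just the one earlier packet, or retransmit everything? In the example above, should it retransmit #2, or #2, #3, #4, #5? The sender does not know which packets caused those three consecutive ack(2)s. Perhaps the sender sent 20 pieces and it was #6, #10, and #20 that triggered them. In that case the sender may well have to retransmit everything from 2 to 20 (which is what some TCP implementations actually do). It is a double-edged sword.

The SACK method

A better approach is Selective Acknowledgment (SACK) (see RFC 2018). It adds a SACK field to the TCP header. The ACK is still the Fast Retransmit ACK, while the SACK reports the fragments of data that have been received.

The sender can then tell from the returned SACK which data has arrived and which has not, which improves on the Fast Retransmit algorithm. Of course, both sides must support it. On Linux the feature is enabled with the tcp_sack parameter (on by default since Linux 2.4).

There is one more thing to watch out for: receiver reneging. Reneging means the receiver has the right to discard data that it has already reported to the sender in a SACK. This is discouraged because it complicates matters, but the receiver may need to do it in extreme cases, for example to give the memory to something more important. So the sender cannot rely on SACK entirely; it must still rely on the ACK and maintain the timeout, and if subsequent ACKs do not advance, the data covered by the SACK must still be retransmitted. In addition, the receiver must never mark SACKed packets as acknowledged.

Note: SACK consumes the sender's resources. If an attacker sends the data sender a pile of SACK options, the sender may start retransmitting or even traversing the data it has already sent, which uses a lot of its resources. For details see "TCP SACK的性能权衡" (the performance trade-offs of TCP SACK).

Duplicate SACK: the problem of data received twice

Duplicate SACK, also called D-SACK, uses SACK to tell the sender which data has been received more than once. RFC 2883 has a detailed description and examples. Here are a few examples (taken from RFC 2883).

D-SACK uses the first SACK block as the marker:

If the range of the first SACK block is covered by the ACK, it is a D-SACK.
If the range of the first SACK block is covered by the second SACK block, it is a D-SACK.

Example 1: lost ACKs

In this example two ACKs are lost, so the sender retransmits the first data packet (3000-3499). The receiver sees it has received a duplicate and replies with SACK=3000-3500. Because the ACK has already reached 4000, meaning everything before 4000 has been received, this SACK is a D-SACK: it tells the sender that duplicate data was received, and the sender also learns that it was the ACKs that were lost, not the data packets. … -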
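The fast retransmit example in the entry above (pieces 1 to 5 with piece 2 lost) can be traced in a few lines of code. The following is a minimal sketch that numbers whole segments rather than bytes, as the article's example does; the function names are made up for this illustration, and the threshold of three identical ACKs in a row follows the article's wording rather than any particular TCP implementation.

```python
def cumulative_acks(arrivals):
    """Return the ACK the receiver sends after each arriving segment.

    The ACK always names the next segment expected in sequence,
    so it cannot skip over a gap.
    """
    received = set()
    next_expected = 1
    acks = []
    for segment in arrivals:
        received.add(segment)
        # Advance past everything that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmit_target(acks, threshold=3):
    """Return the segment to resend once the same ACK arrives
    `threshold` times in a row, or None if that never happens."""
    last_ack, count = None, 0
    for ack in acks:
        if ack == last_ack:
            count += 1
        else:
            last_ack, count = ack, 1
        if count >= threshold:
            return ack
    return None


acks = cumulative_acks([1, 3, 4, 5])      # segment 2 was lost
print(acks)
print(fast_retransmit_target(acks))
print(cumulative_acks([1, 3, 4, 5, 2]))   # after segment 2 is resent
```

Note that the ACK stream alone tells the sender only that segment 2 is missing, not which later segments arrived, which is the gap that SACK fills.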
E-Commerce Platform Options
The big name in open source e-commerce these days is Magento. In my previous job it was just too early and Magento was pretty buggy, but it now seems to be the number one choice of the people I talk to in the industry. The main reasons I'm not keen on it are that it is quite a big code base to learn and that it is in PHP. My attitude to PHP is similar to most French people's attitude to English: I can speak the language but I find it very inelegant and I'm really not keen on using it day-to-day. Over in Python-land I have a few options. Django Shop seems the best bet as a framework to build on, but it seems pretty early days and most of my needs are different, so I would end up with the vast bulk of the code being custom. Incidentally I picked up Beginning Django E-Commerce, which for me was probably a bit basic, but I would thoroughly recommend it for anyone new to e-commerce or Django. -
Python Web Frameworks
I thought long and hard about which framework to choose for this project. My first exposure to Python came from Zope. I really don't like being negative about projects and technologies and I met some very nice people in the Zope community. However I think Zope was a major factor in dooming the project I was working on to failure. The problem we had was that our development was very slow and we had big scaling problems. Zope had a very steep learning curve and, while it is theoretically possible to scale, it makes life very difficult. I got the chance to see Zope deployed in a variety of larger settings and every single one struggled on scaling. The other issue for me (which is very subjective) is that I simply didn't enjoy developing with Zope; I felt the framework kept pushing me in the wrong direction. I looked at Twisted and I know some very major websites using it and the performance is unbelievable; in fact I would go as far as to say that it is the best performing of any Python framework. However I simply couldn't understand it! I'm sure if I'd persevered I would have got it … -
Launch!
It happened! Our first sale today. Somebody actually came to our website and bought from us! The whole system is working pretty much as planned. I find that I make fewer mistakes when I use Django and everything is going so much faster than I'm used to. I suppose there are a few other factors at play: (1) The team is tiny so there is never much discussion. (2) The codebase is also tiny. It is less than 1% the size of the last e-commerce codebase I worked with, so there simply are fewer things to go wrong and everything is very easy to understand. (3) I'm working in Python rather than PHP so I have a language pushing me to do the right thing. Onwards and upwards. -
Supplier Extranet
One of the big challenges in e-commerce is managing stock: if you run out it is a disaster, but if you order too much that's also a disaster! Working with the manufacturers is always a challenge because as soon as you get to any sort of scale you can't simply buy from them; you need to give them forecasts and help them prepare their supply chains too. I'm always keen to automate away tasks so I've created an extranet to allow my manufacturers to get real-time information on the rate of sale (ROS) of their products and to view it per SKU. The slightly tricky thing here is security. The last time I did this I was working with Zen Cart, which doesn't have a concept of permissions: you are either an administrator or not. Fortunately Django's authentication framework is great, very easy to use, and allows finer-grained access rights. I was thinking about generating weekly emails but I've decided not to bother. The web is just a much better system for this type of problem. -
Mea Culpa
Today we had a pretty bad software problem which cost us nearly half the day's sales. We then had another issue due to a failed fix (although this didn't cost any sales). At launch, although we had many difficulties, the software platform wasn't one of them. It worked very well. The reason was that after it was finished it was tested extensively by Clare before there was a live deployment. Her testing caught many, many errors. Post launch there has been a very substantial redevelopment of the code. In particular, introducing PayPal and introducing i18n (which is still underway) have resulted in very major changes. I have, however, been testing the code myself rather than putting it out to someone else. This has meant that some of the bugs have got through to customers. From now on, I will get Clare to test all major changes. The other issue has been that my deployment to the live server is not automatic. The plan was always to automate it via git, but despite spending half a day on it I couldn't get it to work. So it was left as a manual and complex process which needless to say eventually … -
Currencies
I think that in e-commerce it is vital to bill customers in their own currency. Certainly for us, with the UK accounting for such a tiny proportion of the Nespresso capsule market, export will be very important. I’ve finally got round to deploying the completed infrastructure for euro billing. All it takes is a simple change in the config file and the site will start using euros. As soon as Streamline get themselves in gear we will be able to launch an Irish website in euros. It will also be much easier to add other currencies in time. I do really wish that Django had support for currencies rather than leaving it up to the developer, and a quick Google shows that I am not alone in wanting it. There is at least a decimal type but that's not quite the same. The big advantage of currency being built in is that it would make it easier for the different e-commerce projects using Django to share code. -
Test Suite
I've spent the past few days putting together a proper test suite. It's been a lot of work: 813 lines of code and 159 specific tests. It covers the bug I blogged about that caused the site to fail, and from now on whenever there is a bug I will add a test for it to the suite to ensure it never happens again. I'm also using coverage.py, which is a great way of spotting what's still to be done. The tests are a mix of unit tests on very specific parts of the codebase and functional tests that go completely through the process of signing up for the site and placing an order, and also an existing customer placing an order. These tests go right through to charging a card on the SagePay test servers. In a commercial environment it's very hard to get time to spend on quality. There's always a big to-do list, and "Let's just pause for a while to improve quality" is a hard thing to say. The benefits are that I can now go a bit faster with development and make changes with more confidence as I can be more sure … -
First Euro Transaction and a Bug
Today was a bit of a milestone as we launched our Irish Nespresso capsule site using the sites framework and processed our first euro transaction. Everything seemed fine. Unfortunately things then went wrong and a lot of our UK customers ended up presented with euro pricing! Up to this point I'd only had one development server but I quickly set up two different dev servers, one for each country. After a while I was able to reproduce the error on the dev servers with the currencies appearing wrongly on the different sites. After a bit of Googling I came across this post by the wonderful Graham Dumpleton (is it wrong to love another man?). It turns out I had actually stumbled across my first bug in Django, but fortunately there is a simple fix. -
Porting to Zope
After much thought I've decided to port my application away from Django to Zope. I've just been finding things too straightforward with Django and I miss the days of struggling with problems only to discover they were bugs in the framework, or the challenge of getting the ZODB to scale. I'm also really missing XML. The only thing that makes me happier than coding in Python is working in XML. Update: This was of course a very poor April Fools' joke! -
Heroku
I hurt my back yesterday and as a result I'm stuck in bed. So I thought I would make use of the time by evaluating Heroku. I am overall very impressed and next time I do a project I will deploy it on Heroku from the start. However I've come across a few problems. The first one is that Heroku requires the use of a CNAME. Unfortunately CNAMEs can only be used if the domain is in the form http://www.finecoffeeclub.co.uk/ rather than http://finecoffeeclub.co.uk/. Changing the address is out as I would need to re-do several extended validation SSL certificates. Certain DNS providers have their own method of doing a CNAME at the root but unfortunately Freeparking doesn't. I would also have to do some work around getting a fixed IP endpoint for Sagepay, use of the temporary file system, and setting up storage for the static assets. On the other hand, I have no scaling issues with the present infrastructure; indeed I forecast 50 to 100 times more traffic on the present single server, and to date there hasn't been a problem with the infrastructure. I do though have quite a lot of other things to do overall migrating … -
MailChimp
One way or another e-commerce tends to involve quite a bit of email! I had thought about doing it myself; I did run a site a long time ago which fired out a lot of email, and you can get it to work, but it really is jolly hard work. You need to implement SPF and DomainKeys and also get in contact with Microsoft and Yahoo! if you want your mail delivered (Google, as usual, just works perfectly without any hassle at all). Anyway, I've got too many other important tasks so I am using MailChimp. They have a brilliant API which allows really good reporting and very detailed (and easy!) segmentation. It is so nice for non-technical people to be able to mail a segment without requiring custom SQL. I've written a cronjob which calls a Django management command every day to upload my new customers and checkout abandonments. I'm also using MailChimp webhooks to call back to Django when customers unsubscribe. I've added MailChimp's e-com 360 tracking to Django so we've got reports of revenue from each campaign. I've used some incredibly expensive enterprise mail packages before but MailChimp is quite simply streets ahead and tens … -
The American Dream
Americans are beginning to get the Nespresso bug so I think the time has come for us to launch a USA Nespresso capsule site. The good news is that the currency work is already done so all we need to do is fix the English. The Americans really are much more sensible than us when it comes to writing English: they cut out a whole lot of unnecessary vowels and generally spell words closer to their pronunciation. Django has great i18n and I'm translating from en-gb to en-us. The only issue is that out of the box the LocaleMiddleware sets the language based on the web browser. However I want the language set based on which URL the user is on. Fortunately it's a trivial thing to fix. All I've done is put this into my custom middleware.py: request.LANGUAGE_CODE = settings.LANGUAGE_CODE. Then I can set LANGUAGE_CODE in the relevant settings.py. One thing that slightly got me is that locale codes are different to language codes. A locale code looks like en_US while a language code is en-us. It does matter which you use. -
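The difference between the two code styles mentioned at the end of that entry is mechanical, and a small sketch makes it concrete. The helpers below are written for illustration only and their names are made up; Django ships its own `to_locale` and `to_language` functions in `django.utils.translation`, which are the ones to use in a real project. This version handles only the simple language-COUNTRY form.

```python
def language_to_locale(language_code):
    """Convert a language code such as 'en-us' to a locale name such as 'en_US'."""
    language, _, country = language_code.partition("-")
    if not country:
        # A bare language such as 'fr' has no country part to upper-case.
        return language.lower()
    return f"{language.lower()}_{country.upper()}"


def locale_to_language(locale_name):
    """Convert a locale name such as 'en_US' to a language code such as 'en-us'."""
    return locale_name.replace("_", "-").lower()


print(language_to_locale("en-us"))
print(locale_to_language("en_GB"))
```

The language code is what goes in the `LANGUAGE_CODE` setting, while the locale name is what appears in the directory names under `locale/`.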
Encoding Problem
It was pointed out today that the euro sign was not displaying properly on the news page. I had a look and I saw that the problem is that the database encoding is set to Latin-1 instead of UTF-8. It turns out that was the default on the version of Debian I started with, and as I've dumped and restored the database the encoding has remained the same. The solution is relatively simple, which is to dump the database and reload with the correct encoding. This could be done during a short maintenance window in the middle of the night. But no! As the sun no longer sets on the Fine Coffee Club empire there is no particularly good time when we can shut down for maintenance. Update: It turned out to only require a couple of minutes of outage. However the point remains. -
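The underlying reason a Latin-1 database cannot hold the euro sign is that Latin-1 (ISO-8859-1) has no code point for it, whereas the pound sign is there. A quick check with the Python standard library shows this; the helper name `can_encode` is made up for this illustration.

```python
def can_encode(text, encoding):
    """Return True if text can be represented in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


for symbol in ("£", "€"):
    print(symbol, "latin-1:", can_encode(symbol, "latin-1"),
          "utf-8:", can_encode(symbol, "utf-8"))
```

This is why the site worked fine while it only billed in pounds and the problem surfaced only once euro prices appeared.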
Currency Refactor
These are technical notes just for me on the currency refactor. The background is that currency is presently handled in the presentation layer, in the templates. However as I'm expanding currencies this is getting more complex, and there is also a problem that invoices are generated as a PDF, so I have to do the currency code twice: once for the templates and once for the PDF, which breaks the DRY (Don't Repeat Yourself) rule. The code has to deal with the following: (1) Changing currency symbol, e.g. £1.00 and $1.00. (2) Changing decimal separator, e.g. €1.00 and €1,00. (3) Changing currency symbol position, e.g. €1.00 and 1,00€. (4) Different currencies with the same symbol, e.g. Canadian dollars (CAD) and US dollars (USD). An additional requirement is that I want to be able to add currencies relatively frequently without changing too much code. Pushing this down into the model layer should make things a bit easier but there is quite a lot of complexity that my design has to cope with. (1) Products and Shipment Methods have multiple prices, one for each currency, chosen by settings.py. (2) Baskets have one currency for multiple totals depending on settings.py. (3) Shipment and Order have the currency in the model and multiple totals. (4) Admin interface reports … -
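The four formatting requirements listed in that entry (symbol, decimal separator, symbol position, and shared symbols) can be captured in a small table-driven formatter that both templates and PDF code could call. The sketch below is only one possible shape for it, not the author's design: the class, the function, and the keys in `FORMATS` are all hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CurrencyFormat:
    symbol: str
    decimal_separator: str = "."
    symbol_after: bool = False


# Hypothetical keys; USD and CAD share a symbol but stay separate entries.
FORMATS = {
    "GBP": CurrencyFormat("£"),
    "USD": CurrencyFormat("$"),
    "CAD": CurrencyFormat("$"),
    "eur_prefix": CurrencyFormat("€"),
    "eur_suffix": CurrencyFormat("€", decimal_separator=",", symbol_after=True),
}


def format_money(amount, key):
    """Format a non-negative Decimal amount using the rules stored under key."""
    fmt = FORMATS[key]
    # Always show two decimal places, then swap in the separator.
    number = f"{amount.quantize(Decimal('0.01')):f}".replace(".", fmt.decimal_separator)
    return f"{number}{fmt.symbol}" if fmt.symbol_after else f"{fmt.symbol}{number}"


print(format_money(Decimal("1"), "GBP"))
print(format_money(Decimal("1"), "eur_suffix"))
```

Adding a currency then means adding one entry to the table rather than touching template or PDF code, which is the kind of change the entry says should be cheap.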
Testing Update
I'm off on holiday on Friday for a week without much in the way of internet access so I'm going to have a code freeze as of now. The final code I committed was i18n for the French launch. Unfortunately there was a bug in one of the live site settings files which caused some problems on the Irish site. I fixed it and wrote a test which checks all the major settings across all the countries. I also ran coverage.py on the codebase, which shows the percentage of the code covered by tests and can generate a nice HTML report. At the moment it is 61%. During the code freeze I am going to focus on adding unit tests to try to get the coverage over 70% this week. In other news, outsourcing of fulfilment is going well. We have chosen Seko, who have warehousing in the USA, UK, Europe, Australia and other locations, giving us a complete global presence. They are really a bit big for us but that means we can stick with them long term. There's going to be quite a lot of work in integrating into their API but it is very well documented. It also means … -
Django / ZenDesk API
Anna was off and I was left doing all of the email support, which we do through the excellent Zendesk. After about 10 minutes I was deeply frustrated with the process of finding customer orders on our system. So I've written a simple application which links straight from a ticket to the relevant orders with just a couple of clicks. It has saved an enormous amount of time. I'm going to generalise this slightly by getting it to bring up the Django user and then I'll publish it as the first bit of code to be released. -
Django Errors Going to Spam
Like many sites I've set up Django to email me when there are errors. Unfortunately the errors were all ending up in my spam. I had a look at the mail system and there were no problems there. The first issue was very simple. I'd not set SERVER_EMAIL in settings.py, which meant that the emails were being sent from 'root@localhost', which spam filters are not going to like. The second problem was the very large number of emails, which was also a problem for me in that I stopped taking the emails seriously. The Django ALLOWED_HOSTS setting improves security but generates a lot of errors (this has been fixed in the development version of Django but for now it fills up the log). I found an excellent article, Prevent email notification on SuspiciousOperation, with detailed code which I've implemented, and it's made a huge difference.
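The general idea of suppressing those emails is to attach a logging filter to the handler that mails the admins, so that records caused by a SuspiciousOperation are dropped before an email is sent. The following is a minimal sketch of such a filter using only the standard library; it is not the code from the article mentioned above, the class name is made up, and matching on the exception class name is a shortcut taken here so the example runs without Django. In a real project you would test against `django.core.exceptions.SuspiciousOperation` directly.

```python
import logging


class SkipSuspiciousOperation(logging.Filter):
    """Drop log records whose exception is a SuspiciousOperation."""

    def filter(self, record):
        if record.exc_info:
            exc_type = record.exc_info[0]
            if exc_type is not None:
                # Assumption: match by class name anywhere in the hierarchy,
                # so subclasses of SuspiciousOperation are skipped too.
                names = {cls.__name__ for cls in exc_type.__mro__}
                if "SuspiciousOperation" in names:
                    return False
        # Everything else is still reported.
        return True
```

The filter would then be listed under the `filters` key of the mail handler in the `LOGGING` setting, alongside a proper `SERVER_EMAIL` value so the remaining error mails come from a real address.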