Monthly archive: May 2015

MySQL optimization notes

Two MySQL statements return the same result and are written almost identically, yet one is nearly 100 times slower than the other. Why?
EXPLAIN SELECT * FROM wareprice_01 wp WHERE id=(SELECT MAX(id) FROM wareprice_01 p WHERE p.wareid IN (44166981760,43770830980,43770832630,44976304870,43994294220,38585515990,45140073950) AND wp.wareid = p.wareid);

id  select_type         table  type  possible_keys  key      key_len  ref               rows   Extra
1   PRIMARY             wp     ALL   NULL           NULL     NULL     NULL              90770  Using where
2   DEPENDENT SUBQUERY  p      ref   wareidp        wareidp  8        tmware.wp.wareid  1      Using where

EXPLAIN SELECT * FROM wareprice_01 wp WHERE id=(SELECT MAX(id) FROM wareprice_01 p WHERE wp.wareid = p.wareid) AND wp.wareid IN (44166981760,43770830980,43770832630,44976304870,43994294220,38585515990,45140073950);

id  select_type         table  type   possible_keys  key      key_len  ref               rows  Extra
1   PRIMARY             wp     range  wareidp        wareidp  8        NULL              8     Using where
2   DEPENDENT SUBQUERY  p      ref    wareidp        wareidp  8        tmware.wp.wareid  1
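
Reading the two plans side by side suggests an answer. In the first query the IN filter sits inside the correlated subquery, so the outer table wp is scanned in full (type ALL, 90770 rows). In the second the IN filter applies to wp itself, so the optimizer can use a range scan on the wareidp index (8 rows). The sketch below is only a sanity check that the two forms return the same rows. It uses an in-memory SQLite table rather than MySQL, and the price column and the sample rows are invented for illustration.

```python
import sqlite3

WAREIDS = (44166981760, 43770830980, 43770832630)

# Filter inside the correlated subquery (the slow form in the post).
FILTER_IN_SUBQUERY = """
SELECT * FROM wareprice_01 wp
WHERE id = (SELECT MAX(id) FROM wareprice_01 p
            WHERE p.wareid IN ({ids}) AND wp.wareid = p.wareid)
"""

# Filter on the outer table (the fast form in the post).
FILTER_ON_OUTER = """
SELECT * FROM wareprice_01 wp
WHERE id = (SELECT MAX(id) FROM wareprice_01 p WHERE wp.wareid = p.wareid)
  AND wp.wareid IN ({ids})
"""


def latest_rows(conn, template, wareids):
    """Run one query form with bound parameters and return sorted rows."""
    placeholders = ",".join("?" * len(wareids))
    sql = template.format(ids=placeholders)
    return sorted(conn.execute(sql, wareids).fetchall())


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: only id and wareid appear in the original post.
    conn.execute(
        "CREATE TABLE wareprice_01 "
        "(id INTEGER PRIMARY KEY, wareid INTEGER, price REAL)"
    )
    conn.execute("CREATE INDEX wareidp ON wareprice_01 (wareid)")
    rows = [
        (1, 44166981760, 10.0),
        (2, 44166981760, 9.5),
        (3, 43770830980, 20.0),
        (4, 99999999999, 5.0),  # wareid outside the IN list
        (5, 99999999999, 4.0),
    ]
    conn.executemany("INSERT INTO wareprice_01 VALUES (?, ?, ?)", rows)
    slow = latest_rows(conn, FILTER_IN_SUBQUERY, WAREIDS)
    fast = latest_rows(conn, FILTER_ON_OUTER, WAREIDS)
    conn.close()
    return slow, fast
```

Both forms keep only the newest row per listed wareid. The difference is where the filter is applied, which is what the type and rows columns of the plans reflect.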

Proxies: speed beats everything

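Free proxies found online fail at a high rate, for a few reasons:
1. Most proxy servers are compromised hosts. Some are end-user PCs infected with trojans, others are VPSes. A few of them are high quality, but high-quality proxies are rarely given away for free.
2. Many people build crawlers or bulk-posting tools and want free proxies, so these low-quality proxies get heavy use and may well be overloaded until they go down.
3. The network administrator notices and shuts the proxy down.
4. The vulnerability gets patched.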

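So to get the better proxy resources:
1. Test and use a new proxy as soon as you find it. Do not hold on to it for long; the more other people use it, the sooner it fails.
2. When testing whether a proxy is usable, avoid wasteful tests.
3. Test connectivity against high-quality addresses: CDN resources such as jquery.min.js first, then large sites such as the qq home page.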
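
One way to read points 2 and 3 together is to try the test targets in a fixed order and stop at the first success. The following is a minimal Python 3 sketch of that idea, not the code used for this site. The CDN URL is a placeholder, the function names are invented, and the fetch parameter exists only so the logic can be exercised without a network.

```python
import http.client
import urllib.request

# Test targets in the order suggested above: a CDN-hosted static file
# first, then a large site's home page. The CDN URL is a placeholder.
TEST_URLS = [
    "http://cdn.example.com/jquery.min.js",
    "http://www.qq.com/",
]


def fetch_via_proxy(proxy, url, timeout):
    """Fetch url through an HTTP proxy given as 'host:port'; return the status."""
    handler = urllib.request.ProxyHandler({"http": "http://" + proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.getcode()


def first_reachable(proxy, urls=TEST_URLS, timeout=5.0, fetch=fetch_via_proxy):
    """Return the first test URL the proxy can fetch, or None if all fail."""
    for url in urls:
        try:
            if fetch(proxy, url, timeout) == 200:
                return url  # one success is enough; skip the remaining tests
        except (OSError, http.client.HTTPException):
            continue  # timeouts, refused connections, garbage replies
    return None
```

A short timeout matters as much as the ordering, since a dead proxy otherwise holds up the whole batch.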

URLs that return your public IP

Proxy validation needs endpoints like these, and small sites cannot withstand heavy test traffic, so I collected several.
http://1111.ip138.com/ic.asp,
http://ip.360.cn/IPShare/info,
http://www.ip508.com/ip,
http://myip.com.tw/,
http://ip.xianhua.com.cn/,
http://www.ip.cn/,
http://www.123cha.com/ip,
http://www.ip38.com/,
http://ip.chinaz.com,
http://www.cz88.net/ip/index.aspx,
--------------------------------------------
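The following one belongs to Taobao. It handles a large volume, and since both Tmall and Taobao rely on it, it should withstand heavy traffic and is worth a try.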
http://www.taobao.com/help/getip.php,
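
Each of these pages wraps the address in different markup, so a validator has to pull the IP out of the response body. The sketch below assumes the page contains a plain dotted-quad IPv4 address somewhere in its text. The function name is invented and the response formats of the sites above were not checked.

```python
import re

# Four groups of 1-3 digits separated by dots, not glued to other digits.
# This simple pattern does not reject longer dotted runs such as "1.2.3.4.5".
_IPV4 = re.compile(r"(?<!\d)(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})(?!\d)")


def extract_ipv4(body):
    """Return the first valid dotted-quad IPv4 address in body, or None."""
    for match in _IPV4.finditer(body):
        octets = match.groups()
        if all(int(octet) <= 255 for octet in octets):
            return ".".join(octets)
    return None
```

Comparing the extracted address with the proxy's own address is one way to tell whether the request really went through the proxy.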

IP geolocation reference databases

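There are currently two major IP reference sources, recorded here:
http://ftp.apnic.net/apnic/stats/apnic/delegated-apnic-latest, the IP allocation table for APNIC (Asia-Pacific). There are five regional IP allocation registries worldwide; the other four are AfriNIC (Africa), ARIN (North America), LACNIC (Latin America and the Caribbean), and RIPE NCC (Europe, the Middle East, and Central Asia).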
ftp://ftp.arin.net/pub/stats/arin/delegated-arin-latest,
ftp://ftp.ripe.net/ripe/stats/delegated-ripencc-latest,
ftp://ftp.afrinic.net/pub/stats/afrinic/delegated-afrinic-latest,
ftp://ftp.apnic.net/pub/stats/apnic/delegated-apnic-latest,
ftp://ftp.lacnic.net/pub/stats/lacnic/delegated-lacnic-latest ,
http://www.cz88.net/, the well-known Chinese Chunzhen (纯真) IP database.
IP/int conversion in Python:

def ip_to_int(iptxt):
    # Treat the four dotted octets as base-256 digits, least significant last.
    return sum(256 ** j * int(i) for j, i in enumerate(iptxt.split('.')[::-1]))

def int_to_ip(ip):
    # Use // so the octets stay integers on Python 3 as well as Python 2.
    return '.'.join(str(ip // (256 ** i) % 256) for i in range(3, -1, -1))
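
The delegated-*-latest files listed above are pipe-separated text. Each record has the form registry|cc|type|start|value|date|status, and for ipv4 records the value field is the number of addresses in the block. A minimal Python 3 sketch of turning one record into an integer range follows. The function name is invented, and the standard ipaddress module does the conversion so the block stands alone.

```python
import ipaddress


def parse_delegated_ipv4(line):
    """Parse one record of an RIR delegated-*-latest file.

    Returns (country, first_ip_int, last_ip_int) for ipv4 records and
    None for comments, the version line, summary lines, and other types.
    """
    fields = line.strip().split("|")
    if len(fields) < 7 or fields[2] != "ipv4" or fields[1] in ("", "*"):
        return None
    start = int(ipaddress.IPv4Address(fields[3]))
    count = int(fields[4])  # number of addresses, not a prefix length
    return fields[1], start, start + count - 1
```

With the ranges sorted by their first address, a lookup becomes a binary search on the integer form of the IP.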

Front-end player resources

http://www.52player.com/
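The site above collects the popular front-end players currently available and is well worth bookmarking.
These are the players I downloaded and tested:
http://pan.baidu.com/s/1mg7n8yk, relatively simple, enough for most cases.
http://pan.baidu.com/s/1dD2IgFZ, feature-rich, with several player styles (playlist mode, JS-interactive mode, classic mode, and so on) that cover nearly every situation. The official home page is http://www.alsacreations.fr/dewplayer.html, and tutorials are plentiful.
http://www.spencer-tech.com/, an open-source player that can be modified freely (http://pan.baidu.com/s/1sjkcOFV). It has three modes (mini, playlist, and playlist with albums), the skin and size are adjustable, and it is the best music widget I have found so far.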

Versions of Scrapy's dependencies

The dependency set I currently use for stable Scrapy runs:
pip install Twisted==14.0.2 w3lib==1.11.0 queuelib==1.2.2 lxml==3.3.0 pyOpenSSL==0.12 cssselect==0.9.1 six==1.9.0 scrapy==0.24.4 pycurl==7.19.5.1 django==1.4.18 mysql-python==1.2.5 service_identity==14.0.0 selenium==2.44.0 simplejson==3.6.5
pip install Twisted==14.0.2 https://pypi.python.org/packages/source/T/Twisted/Twisted-14.0.2.tar.bz2
pip install w3lib==1.11.0
pip install queuelib==1.2.2
pip install lxml==3.3.0
pip install pyOpenSSL==0.12
pip install cssselect==0.9.1
pip install six==1.9.0
pip install scrapy==0.24.4
pip install pycurl==7.19.5.1
pip install django==1.4.18
pip install mysql-python==1.2.5
pip install service_identity==14.0.0
pip install selenium==2.44.0
pip install simplejson==3.6.5
[These versions were running on CentOS 5.8 and are unchanged after the move to 6.6. Installing the versions below resolves the SSL/HTTPS crash bug. Note added 14 August 2015.]
characteristic (14.3.0)
cssselect (0.9.1)
Django (1.4.18)
lxml (3.3.0)
MySQL-python (1.2.5)
pip (1.5.6)
pyasn1 (0.1.7)
pyasn1-modules (0.0.5)
pycurl (7.19.5.1)
pyOpenSSL (0.12)
pypm (1.4.3)
queuelib (1.2.2)
Scrapy (0.24.4)
selenium (2.44.0)
service-identity (14.0.0)
setuptools (5.2)
simplejson (3.6.5)
six (1.9.0)
Twisted (14.0.2)
w3lib (1.11.0)
wsgiref (0.1.2)
zope.interface (4.1.2)
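
To confirm that a machine still matches the pins, the listing above can be compared against them mechanically. The sketch below is Python 3 and assumes the old pip list output format of "Name (version)" shown above. PINS is only a subset chosen for illustration, and the function names are invented.

```python
import re

# A few of the pins from the install command above (illustrative subset).
PINS = {
    "Twisted": "14.0.2",
    "Scrapy": "0.24.4",
    "pyOpenSSL": "0.12",
    "service_identity": "14.0.0",
}

_LINE = re.compile(r"^(\S+) \(([^)]+)\)$")


def _normalize(name):
    # pip treats case, '-' and '_' as equivalent in project names.
    return name.lower().replace("_", "-")


def parse_pip_list(text):
    """Map normalized package name -> version from 'Name (version)' lines."""
    installed = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            installed[_normalize(match.group(1))] = match.group(2)
    return installed


def mismatches(pins, installed):
    """Return {name: (wanted, found_or_None)} for every unsatisfied pin."""
    result = {}
    for name, wanted in pins.items():
        found = installed.get(_normalize(name))
        if found != wanted:
            result[name] = (wanted, found)
    return result
```

An empty result means every pinned package is present at the expected version.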

scrapy _getEndpoint() takes exactly 4 arguments (2 given)

Twisted 15.0 appears to have changed the signature of the _getEndpoint method on twisted.web.client.Agent. This causes the http11 handler to throw exceptions like so:
Traceback (most recent call last):
  File "/usr/share/python/spotify-prelude2-directed-crawlers/local/lib/python2.7/site-packages/scrapy/core/downloader/middleware.py", line 38, in process_request
    return download_func(request=request, spider=spider)
  File "/usr/share/python/spotify-prelude2-directed-crawlers/local/lib/python2.7/site-packages/scrapy/core/downloader/__init__.py", line 123, in _enqueue_request
    self._process_queue(spider, slot)
  File "/usr/share/python/spotify-prelude2-directed-crawlers/local/lib/python2.7/site-packages/scrapy/core/downloader/__init__.py", line 143, in _process_queue
    dfd = self._download(slot, request, spider)
  File "/usr/share/python/spotify-prelude2-directed-crawlers/local/lib/python2.7/site-packages/scrapy/core/downloader/__init__.py", line 154, in _download
    dfd = mustbe_deferred(self.handlers.download_request, request, spider)
  File "/usr/share/python/spotify-prelude2-directed-crawlers/local/lib/python2.7/site-packages/scrapy/utils/defer.py", line 39, in mustbe_deferred
    result = f(*args, **kw)
  File "/usr/share/python/spotify-prelude2-directed-crawlers/local/lib/python2.7/site-packages/scrapy/core/downloader/handlers/__init__.py", line 40, in download_request
    return handler(request, spider)
  File "/usr/share/python/spotify-prelude2-directed-crawlers/local/lib/python2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 36, in download_request
    return agent.download_request(request)
  File "/usr/share/python/spotify-prelude2-directed-crawlers/local/lib/python2.7/site-packages/scrapy/core/downloader/handlers/http11.py", line 174, in download_request
    d = agent.request(method, url, headers, bodyproducer)
  File "/usr/share/python/spotify-prelude2-directed-crawlers/local/lib/python2.7/site-packages/twisted/web/client.py", line 1560, in request
    endpoint = self._getEndpoint(parsedURI)
exceptions.TypeError: _getEndpoint() takes exactly 4 arguments (2 given)

That method's signature in Twisted 15.0.0 is def _getEndpoint(self, uri): while in version 14.0.2 it is def _getEndpoint(self, scheme, host, port): Keeping Twisted pinned at 14.0.2, as in the dependency list above, avoids the mismatch.
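
The traceback shows the one-argument call being made from inside twisted/web/client.py while the method that receives it still expects scheme, host and port. One plausible reading is that the callee is an override written against the older signature. The sketch below reproduces that shape with stand-in classes. Neither class is real Twisted or Scrapy code, and all names are invented for illustration.

```python
class NewStyleBase(object):
    """Stand-in for a base class that makes the newer one-argument call."""

    def request(self, uri):
        return self._getEndpoint(uri)

    def _getEndpoint(self, uri):
        return ("endpoint", uri)


class OldStyleOverride(NewStyleBase):
    """Stand-in for a subclass still written against the older signature."""

    def _getEndpoint(self, scheme, host, port):
        return ("endpoint", scheme, host, port)


def request_fails(agent, uri):
    """Return True if agent.request(uri) raises TypeError."""
    try:
        agent.request(uri)
    except TypeError:
        return True
    return False
```

The override is internally consistent; it only breaks because the base class changed how it calls a private method. That is why pinning the library version is the quickest fix.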

Debugging PHP on CentOS 5.8

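1. Install the LNMP stack (php5.4.x) with Aliyun's one-click LNMP installer. http://xiazai.jb51.net/201407/tools/aliyun-sh-1.3.0.rar
2. Install phpstorm8.0.1
3. Install xdebug-2.2.2. The php.ini settings below were worked out by comparison with a wamp2.4 setup.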

tar zxvf xdebug-XDEBUG_2_2_2.tar.gz
cd xdebug-XDEBUG_2_2_2
phpize
./configure --enable-xdebug --with-php-config=/alidata/server/php-5.4.23/bin/php-config
make
cp modules/xdebug.so /alidata/server/php-5.4.23/bin/xdebug.so
Edit php.ini (check the phpinfo(); output for the path of the php.ini file in use)
vim /alidata/server/php-5.4.23/etc/php.ini
Add the following:
[Xdebug]
zend_extension =/alidata/server/php-5.4.23/bin/xdebug.so
xdebug.remote_enable = On
xdebug.profiler_enable = On
xdebug.profiler_enable_trigger = On

xdebug.auto_trace = on
xdebug.auto_profile = on
xdebug.collect_params = on
xdebug.collect_return = on
xdebug.profiler_enable = on
xdebug.trace_output_dir = "/alidata/log/xdebug"
xdebug.profiler_output_dir = "/alidata/log/xdebug"
xdebug.dump.GET = *
xdebug.dump.POST = *
xdebug.dump.COOKIE = *
xdebug.dump.SESSION = *
xdebug.var_display_max_data = 4056
xdebug.var_display_max_depth = 5

**********************************************************************
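Note: if you have installed Zend Optimizer or ZendGuardLoader, check whether your php.ini already contains a zend_extension= line.
If it does, comment it out by putting ";" in front of zend_extension= so that xdebug can install and run properly. Example:
;zend_extension="/usr/local/lib/php/20060613/ZendExtensionManager.so"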
;zend_extension=/alidata/server/php/lib/php/extensions/no-debug-non-zts-20100525/ZendGuardLoader.so
**********************************************************************
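
The check described in the note can be automated. The sketch below is Python 3 and scans php.ini text for zend_extension lines that are still active and do not load xdebug. The function name is invented, and it only looks at whole-line ";" comments.

```python
import re

_ZEND = re.compile(r"^\s*zend_extension(?:_ts)?\s*=\s*(.+?)\s*$")


def conflicting_zend_extensions(ini_text):
    """Return active zend_extension values that do not load xdebug."""
    found = []
    for line in ini_text.splitlines():
        if line.lstrip().startswith(";"):
            continue  # already commented out, as the note recommends
        match = _ZEND.match(line)
        if match:
            value = match.group(1).strip("\"'")
            if "xdebug" not in value.lower():
                found.append(value)
    return found
```

Any path it returns is a line to comment out before restarting PHP.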
cd /tmp
mkdir xdebug

chmod -R 777 xdebug/
--------------------------------------------
Below is the complete WAMP 2.4 configuration, for reference only.

; XDEBUG Extension

zend_extension = "f:/wamp/bin/php/php5.4.16/zend_ext/php_xdebug-2.2.3-5.4-vc9.dll"

[xdebug]
xdebug.remote_enable = On
xdebug.profiler_enable = On
xdebug.profiler_enable_trigger = On
xdebug.profiler_output_name = cachegrind.out.%t.%p
xdebug.profiler_output_dir = "f:/wamp/tmp"