Ryan
3d7a52338d
remove the scrapers
2020-05-01 13:11:32 -07:00
Simon Lipp
f077c13882
scraper-yggtorrent: bugfix
2017-08-12 19:30:11 +02:00
Simon Lipp
2d296bb526
new scraper: bookys
2017-08-10 10:36:34 +02:00
Simon Lipp
b615704d54
scraper-yggtorrent: fix title parsing
2017-08-10 10:33:02 +02:00
Simon Lipp
0efc90c9f9
scraper-yggtorrent: typo
2017-07-27 15:55:23 +02:00
Simon Lipp
28054b6bfd
Scrapers overhaul
...
* Switch all python scrapers to scrapy
* Allow scrapers to be directly called, instead of
using `scrapy runspider`
* Prefix scrapers with `ua-scraper-` for clarity
* Update documentation
2017-07-26 12:08:32 +02:00
Simon Lipp
874c844f78
remove unmaintained scrapers
...
ipboard2json can be replaced by weboobmsg2json + weboob tapatalk module
2017-07-26 09:39:19 +02:00
Simon Lipp
df56355fe1
new scraper for yggtorrent.com
2017-07-26 09:36:06 +02:00
Simon Lipp
913f961c42
new scraper for torrent9
2017-07-26 09:10:18 +02:00
Simon Lipp
b0b5ea7a9a
python3 support for some scrapers
2017-07-26 09:10:18 +02:00
Simon Lipp
4b20c6d92e
remove torrent9 scraper (not working anymore)
2017-07-26 09:10:18 +02:00
Simon Lipp
325dab92c6
bugfix
2017-02-24 08:37:47 +01:00
Simon Lipp
e4168d7b4f
ua: update doc
2017-02-23 14:02:10 +01:00
Simon Lipp
1f7d3ab3a1
ua-proxify: add doc
2017-02-23 14:01:39 +01:00
Simon Lipp
84b48c6205
ggs: add support for -once
2017-02-23 14:00:24 +01:00
Simon Lipp
e658b62bab
ggs: better logging & timeout handling
2017-02-23 13:57:47 +01:00
Simon Lipp
e0c1edec6d
maildir-put: change redis cache storage format
2017-01-11 16:11:23 +01:00
Simon Lipp
5d517ae1ff
edxcourses: add session to id
2017-01-11 16:10:48 +01:00
Simon Lipp
25a2bd713f
scraper for torrent9
2016-12-26 19:17:06 +01:00
Simon Lipp
80cbd4a9cc
scraper for bm-lyon
2016-12-26 19:08:20 +01:00
Simon Lipp
c1c70412fc
new scraper for t411.li
2016-12-26 19:05:55 +01:00
Simon Lipp
4537e967e9
new scraper for myanimelist.net
2016-12-26 18:35:57 +01:00
Simon Lipp
cb327a5dbc
ggs: use jq for json generation in configuration file
2016-04-04 09:27:59 +02:00
Simon Lipp
4e54d30fa2
ggs: reload configuration when receiving SIGUSR1
2016-04-01 14:55:54 +02:00
Simon Lipp
2941ef15df
ggs: get rid of global variables
2016-03-29 10:35:10 +02:00
Simon Lipp
83e1c1c308
update .gitignore
2016-03-23 13:56:04 +01:00
Simon Lipp
83504e2944
edxcourses: add date
2016-03-23 13:55:02 +01:00
Simon Lipp
896ce9b657
new filter: ua-proxify
2016-03-23 13:54:40 +01:00
Simon Lipp
2ee32d8d81
ua-inline: use attachment for images
2016-03-21 11:56:25 +01:00
Simon Lipp
be51bd77f6
edx-courses scraper: fixes
2016-02-26 10:20:45 +01:00
Simon Lipp
f1d31d9505
add scraper edxcourses
2016-02-25 15:42:01 +01:00
Simon Lipp
62d67fc0ae
update doc
2016-02-12 09:10:13 +01:00
Simon Lipp
49b68e3b7c
maildir-put: add redis support for messages cache
2016-02-12 09:06:59 +01:00
Simon Lipp
de86e77838
Use default $GOPATH
2016-02-11 17:12:57 +01:00
Simon Lipp
99e4299b18
gofmt
2016-02-11 14:40:33 +01:00
Simon Lipp
36f0c8392e
New scraper: weboobmsg2json
...
Use [weboob](http://weboob.org) backends to get messages
2016-02-10 10:47:26 +01:00
Simon Lipp
52176a4b1d
ipboard2json: shorten message ids
...
Since the hostname is already present in the right part, it’s useless to also
put it in the left part of the ID.
2016-02-09 14:05:45 +01:00
Simon Lipp
7e49bf3b76
maildir-put: better msg-id encoding
...
maildir-put currently uses a very crude scheme to generate
RFC 2822-compliant message ids from ids provided by scrapers: it just
hashes the ID with SHA-256.
Since this is not exactly optimal for debugging purposes, change this by
properly encoding message ids according to RFC 2822.
2016-02-09 11:12:10 +01:00
Simon Lipp
7c35a852eb
ipboard scraper: add type=cite attribute to blockquote tags
...
This allows certain UAs (Thunderbird) to present the tag as a citation
2016-02-03 10:24:13 +01:00
Simon Lipp
ac15762448
scraplib.py: close response on redirection
2016-02-03 10:23:08 +01:00
Simon Lipp
4b20fe9c64
ua-inline: inline <style> tags
2016-02-03 10:22:19 +01:00
Simon Lipp
32b584e7fe
maildir-put: use LF line endings
2016-02-03 10:21:40 +01:00
Simon Lipp
661ca5e4e5
Fix compilation issue
2016-02-03 10:15:11 +01:00
Simon Lipp
d37c513fb6
update Makefile
2014-03-22 16:39:13 +01:00
Simon Lipp
4b1d4af8b2
new scraper: medscape2json
2014-03-22 16:38:06 +01:00
Simon Lipp
f4355b8b1a
scraplib.py: add a non-persistent cookiejar
2014-03-22 15:49:11 +01:00
Simon Lipp
e03c90d635
scraplib.py: simplify cookie management
2014-03-22 15:48:46 +01:00
Simon Lipp
7913eb9d77
add usage for ipboard2json and mangareader2json
2014-03-22 11:34:28 +01:00
Simon Lipp
00be42bdd9
Fix a compilation problem when GOPATH is not set
2014-03-18 16:56:58 +01:00
Simon Lipp
78b8a6a447
initial import
2014-03-18 15:01:19 +01:00