
Scripts Overview

# Script Name Script Status Products in Database Items Mined time scanning end local elapsed time scanning Shop Statistics Json link Warnings Cron Schedule Latest Change Known Issues Description Work Todo Webkit or Browser simple Sitemap Affiliate
1 bike24.com ready 118357 112417 2019-Dec-05 18:10:46 13:21:14 [click] [click] @weekly 2019-12-04     -     -     - simple yes no
2 wiggle.co.uk ready 67117 26255 2019-Dec-09 21:53:08 02:40:00 [click] [click] @daily 2019-10-28     -     -     - simple yes yes
3 wiggle.dk ready 63134 25811 2019-Dec-09 21:42:49 02:16:41 [click] [click] @daily 2019-11-21     -     -     - simple yes yes
4 bikester.dk ready 54180 43442 2019-Dec-09 13:41:10 08:13:03 [click] [click] @daily 2019-10-28     -     -     - simple yes no
5 cykelkraft.dk ready 29227 20563 2019-Dec-09 18:54:35 10:04:40 [click] [click] @weekly 2019-11-27     - (Description is still in Swedish on their .dk site. (2019-10-01))     - simple no no
6 cykelpartner.dk ready 28602 0 2019-Dec-09 23:32:14 00:00:01 [click] [click] all records missing field: item_price @daily 2019-11-07     -     -     - simple no yes
7 bike-discount.de ready 25520 23628 2019-Dec-09 09:55:33 04:54:26 [click] [click] 14 Record(s) from last mined in json file have item_price: 0 @daily 2019-11-27     -     -     - simple yes no
8 tredz.co.uk ready 24344 22890 2019-Dec-09 05:40:49 01:45:41 [click] [click] @weekly 2019-11-25 Fix the __match and __mine functions so they are used properly, plus a general cleanup. Issues with handling two different breadcrumb setups: one derived from the url when no breadcrumbs are present, and a normal one. See notes tagged (Breadcrumb). 2019/09/19 (ch: This script excludes the specific brands by excluding them from the sitemap and looking at the meta brand tag. 2019/11/11) simple yes yes
9 cykelgear.dk ready 23174 21936 2019-Dec-05 20:25:06 00:59:58 [click] [click] 8 Record(s) from last mined in json file have item_price: 0 @weekly 2019-11-14     -     -     - simple yes no
10 rosebikes.com ready 17248 13628 2019-Dec-05 14:44:00 01:14:36 [click] [click] @weekly 2019-11-25     -     -     - simple yes no
11 allbike.dk ready 15712 11831 2019-Dec-07 15:38:57 03:46:49 [click] [click] 5 Record(s) from last mined in json file have item_price: 0 @weekly 2019-11-20     -     -     - simple yes no
12 rutlandcycling.com ready 12736 11847 2019-Dec-08 13:15:50 04:58:41 [click] [click] @weekly 2019-10-03     - (ch: item_invalid, in-stock problem. Some items show as out of stock in simple browser even though they are really in stock on their site; don't know how to fix this issue 2019/10/03), (ch: Breadcrumb issue with breadcrumbs that contain spaces, like "Gloves & Mittens", "Storage & Transport". If the delimiter for breadcrumbs were something like " | " or " > " it would be easier to determine the item_type_text. Also a description_html issue: see the last two urlQueueAppends(). Chose to go with breadcrumbs__fromUrl, which is not really a valid solution. 2019-10-02)     - simple yes yes
13 boerkopcykler.dk ready 12733 5073 2019-Dec-09 13:19:20 01:00:12 [click] [click] @daily 2019-11-08     -     - Uses match functions, (ch: redirect loop still persisted, so trying not to use the sitemap and removing the use of '?a'. 2019/11/08) (ch: redirect loop is solved by appending '?a' to each product url 05/11/2019) simple yes no
14 designcykler.dk ready 11900 9247 2019-Dec-09 12:12:50 01:02:42 [click] [click] @daily 2019-08-12     -     -     - simple yes no
15 evanscycles.com broken 11369 9760 2019-Sep-22 15:30:00 01:44:51 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
4 Record(s) from last mined in json file have item_price: 0
@weekly 2019-09-24     - Price issues, with an extra price being put on the GBP price, and either '/en-dk/ ' or '?currencyIsoCode=DKK&countryIsoCode=DK' seems to work.     - simple yes yes
16 eu.sourcebmx.com ready 10445 5666 2019-Dec-06 21:41:11 05:53:03 [click] [click] 1 Record(s) from last mined in json file have item_price: 0 @weekly 2019-10-09     -     -     - simple yes no
17 bergfreunde.dk ready 9673 9463 2019-Dec-09 23:31:45 01:32:37 [click] [click] @daily 2019-06-13     - can't find colors. (minor)     - simple yes yes
18 probikekit.co.uk ready 9359 5574 2019-Dec-04 21:49:25 00:18:16 [click] [click] 53 Record(s) from last mined in json file have item_price: 0 @weekly 2019-10-10     -     -     - simple yes yes
19 bikesport.dk ready 7906 7675 2019-Dec-04 16:18:41 01:40:32 [click] [click] @weekly 2019-10-09 navigator: (ch: issue with description_html, some products we want from value and some we want from just description_html, mod#2 problem.)     - simple yes no
20 cyclingfreak.dk ready 7798 2358 2019-Dec-09 23:44:44 00:13:20 [click] [click] @weekly 2019-10-07     - The site no longer has any bikes. Only clothes and components, it seems     -     - yes no
21 cyclestore.dk ready 7049 5900 2019-Dec-09 19:39:12 02:37:04 [click] [click] @weekly 2019-11-22     -     -     - simple yes no
22 cykelshoppen.dk ready 6764 5071 2019-Dec-09 07:18:40 00:35:33 [click] [click] @daily 2019-06-25     -     - Simple browser using match function without sitemap simple no no
23 virtucyclinggear.com ready 6744 1888 2019-Dec-04 19:07:25 00:17:16 [click] [click] @weekly 2019-10-11     -     - Homepages have no bikes and almost no components simple no no
24 merlincycles.com broken 6614 5418 2019-Sep-24 05:28:42 00:08:34 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
@weekly 2019-08-25     - (ch: I don't know how to fix this one and have spent quite some time on this script, so disabling it as the prices are wrong 2019/09/25), (Doesn't run well in webkit mode and can't change to browser simple as I can't find a way to change the delivery country and the prices would therefore be wrong. 2019-08-20)     - simple no no
25 cykelexperten.dk ready 6580 5740 2019-Dec-09 00:42:15 02:14:06 [click] [click] @daily 2019-10-21     - This script downloads a lot of unnecessary sitemaps; would like to be able to filter them out with an include("product", "contains"). sitemapLocationsInclude would not work in this case, as it is not the single links but the sitemap links, such as: "http://cykelexperten.dk/product-sitemap1.xml"     - simple yes yes
26 mantel.com ready 5558 0 2019-Dec-09 20:04:08 00:00:00 [click] [click] all records missing field: item_price @weekly 2019-06-03     -     -     -     - yes no
27 hargrovescycles.co.uk ready 5526 5527 2019-Dec-06 14:32:47 03:36:01 [click] [click] products in DB is less than mined. perhaps an import error @weekly 2019-12-04     -     -     - simple yes yes
28 tweekscycles.com broken 5176 5177 2019-May-11 15:49:34 03:19:00 [click] [click] status is set to broken
products in DB is less than mined. perhaps an import error
Last data imported was more than 30 days ago.
@weekly 2019-07-04     - (2019-06-17) Download of https://www.tweekscycles.com/ failed: Error transferring https://www.tweekscycles.com/ - server replied: Not allowed. At least on the main server when running the script. Testing to see if the issue still exists - issue still exists (2019/07/10) simple yes yes
29 planetx.co.uk ready 3763 2633 2019-Dec-08 13:30:56 00:14:46 [click] [click] @weekly 2019-10-10     -     -     - simple no no
30 blacksnow.dk ready 3689 2 2019-Dec-05 21:39:05 00:28:53 [click] [click] all records missing field: item_price @weekly 2019-03-14     - Not the best domFind for price, so keep an eye out for prices on this one.     -     - yes no
31 juhlcycling.dk ready 3629 1530 2019-Dec-09 13:07:08 00:10:59 [click] [click] @daily 2019-03-13     -     -     -     - no no
32 bikeworld.dk ready 3420 2409 2019-Dec-09 14:08:20 03:27:12 [click] [click] @daily 2019-08-13     -     -     - simple yes no
33 bikeandco.dk ready 3023 2416 2019-Dec-05 05:49:03 00:37:43 [click] [click] @weekly 2019-10-02     -     -     -     - no no
34 fribikeshop.dk ready 2956 2353 2019-Dec-09 22:51:04 00:27:56 [click] [click] @daily 2019-10-11     -     -     - simple no no
35 canyon.com ready 2711 1472 2019-Dec-05 09:20:37 00:48:29 [click] [click] @weekly 2019-07-11     -     -     - simple no no
36 ribblecycles.co.uk ready 2539 2366 2019-Dec-04 23:16:26 00:34:53 [click] [click] 21 Record(s) from last mined in json file have item_price: 0 @weekly 2019-10-21     -     -     - simple yes yes
37 altsport.dk broken 2314 0 2019-Jun-10 05:18:54 00:00:01 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
all records missing field: item_price
@weekly 2019-03-18     - Broken: SSL handshake failed (25/06/2019), Can't load a url from them in netscavator     - simple no no
38 heino-cykler.dk ready 2189 2109 2019-Dec-03 01:34:01 00:50:53 [click] [click] @weekly 2019-11-14     -     -     - simple yes no
39 ecykelhjelm.dk ready 2021 1676 2019-Dec-06 12:00:04 00:12:00 [click] [click] @weekly 2019-10-02     - (ch: Issues with loading in urls from their sitemap. See "Issue Sitemap Urls" at the bottom of the script 2019/10/02)     - simple no yes
40 jensencykler.com ready 1900 0 2019-Dec-06 20:04:09 00:00:01 [click] [click] all records missing field: item_price @weekly 2019-10-03     -     - (ch: Haven't found an out-of-stock item yet. 2019/10/03) simple no no
41 coolshop.dk broken 1713 986 2019-Sep-06 05:01:42 12:16:34 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
@weekly 2019-06-28     - Ran for 12 days until noticed and aborted. See build #60 on jenkins. Crawls only relevant categories detected in breadcrumbs simple no no
42 sportactives.dk ready 1703 1529 2019-Dec-08 00:20:20 00:09:11 [click] [click] @weekly 2019-10-10     - (ch: Only takes the first bit of description. 2018-10-03) - (ch: still an issue 2019-10-10)     - simple yes no
43 snowfun.dk broken 1668 0 2019-Oct-08 03:25:39 00:09:25 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
all records missing field: item_price
@weekly 2019-10-11     - (ch: Script only works locally for me; on jenkins it gets stuck on links not working, as /da/bike/ is inserted on each link. See #14 and #15 on jenkins. 2019/10/11), (ch: The product links on their sitemap don't work; found the solution and trying it with filterSitemapUrl() 2019/10/11)     - simple yes no
44 pulsure.dk ready 1546 1543 2019-Dec-06 12:49:53 00:05:04 [click] [click] @weekly 2019-05-24     -     -     - simple yes no
45 xxl.dk ready 1387 860 2019-Dec-09 13:22:27 00:04:31 [click] [click] @weekly 2019-10-07     -     - Uses SiteMapLocationsIncluded('...', 'contains') on "cykel", "rygsak", "mountainbike", "chain-locks", "abus", "floor-pump", "cube-stereo", "bike", "mtb", "cube", "bicycle", "pedal" and "kickstand". simple yes no
46 bikedesign.dk broken 1346 1 2019-Jul-15 12:56:58 00:00:03 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
@weekly 2019-07-16     - Script has issues in webkit, and when running in browser simple mode it gets into a redirect loop on the category pages. (16/07/2019) This script uses browserScrollDownWait() simple no no
47 ecykler.dk ready 1221 0 2019-Dec-09 01:01:41 00:00:33 [click] [click] all records missing field: item_price @weekly 2019-07-01     -     -     - simple no no
48 cykler.dk ready 1193 1170 2019-Dec-06 04:32:47 00:19:39 [click] [click] 5 Record(s) from last mined in json file have item_price: 0 @weekly 2019-11-29     -     - getting colors in this script required some effort. simple yes yes
49 only4kids.dk ready 1045 155 2019-Dec-08 19:50:10 00:03:02 [click] [click] @weekly 2019-06-21     - Crashes with a segmentation fault after 4-6 items if it's set to "browser, simple" mode. This script uses ("browser", "webkit") webkit yes yes
50 1905.dk ready 1031 750 2019-Dec-07 02:16:25 00:07:17 [click] [click] @weekly 2019-10-09     -     -     - simple yes no
51 bikein.dk ready 906 871 2019-Dec-09 18:24:10 00:08:01 [click] [click] @weekly 2019-06-26     -     -     - simple yes no
52 ribecykellager.dk ready 881 712 2019-Dec-04 06:53:12 00:06:04 [click] [click] @weekly 2019-06-26     -     -     - simple yes no
53 jollyroom.dk ready 820 398 2019-Dec-07 02:30:17 00:04:09 [click] [click] @weekly 2019-10-31     -     -     - simple yes no
54 hellorider.dk ready 673 572 2019-Dec-07 08:42:31 00:09:23 [click] [click] @weekly 2019-10-30     -     -     - simple no no
55 cykelrabat.dk ready 664 462 2019-Dec-07 09:33:30 00:16:22 [click] [click] @weekly 2019-04-03     -     -     -     - yes no
56 12tri.dk ready 654 351 2019-Dec-03 21:10:05 00:02:57 [click] [click] 1 Record(s) from last mined in json file have item_price: 0 @weekly 2019-10-09     -     - (ch: could not find any stock information (2019-10-09)) simple yes no
57 prendas.co.uk ready 574 557 2019-Dec-07 08:39:46 00:12:37 [click] [click] @weekly 2019-09-12     -     -     - simple yes no
58 capitanibike.dk     - 556 270 2019-Oct-04 02:13:20 00:01:11 [click] [click] Missing crawler-script.php File
Last data imported was more than 30 days ago.
    -     -     -     -     -     -     - no
59 davincicykler.dk broken 543 530 2019-Jul-30 17:15:31 00:14:20 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
@weekly 2019-07-04     - "download of "https://davincicykler.dk/" is not allowed... I suspect they have blocked the crawler. (14/08/2019)     - simple no no
60 tempocykler.dk ready 538 515 2019-Dec-05 00:31:55 00:01:46 [click] [click] @weekly 2019-06-21     - Images are in /cache/ but can't find a way to get the non-cached version.     - simple yes no
61 coolmtb.dk ready 527 0 2019-Dec-03 15:11:09 00:00:01 [click] [click] all records missing field: item_price @weekly 2019-03-12 Figure out a way to take the whole description Only takes the first paragraph of description.     -     - yes no
62 shopping.coop.dk ready 517 354 2019-Dec-07 16:17:44 00:01:36 [click] [click] @weekly 2019-06-25     -     - This script uses siteMapLocationsInclude contains on: 'cykel', 'loebeur', 'mtb'. simple yes yes
63 sport24outlet.dk ready 476 0 2019-Dec-07 20:21:09 00:00:01 [click] [click] all records missing field: item_price @weekly 2019-06-26     -     -     - simple yes no
64 all4cycling.com broken 450 0 2019-Aug-08 00:11:08 00:00:02 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
all records missing field: item_price
@weekly 2019-04-05 Fix the script, they have changed a lot on their website     - Disabled     - yes yes
65 kjeldscykler.dk ready 422 421 2019-Dec-03 14:20:34 00:02:26 [click] [click] @weekly 2019-09-17     -     -     - simple no no
66 eurotoys.dk ready 414 414 2019-Dec-03 13:33:29 00:01:21 [click] [click] @weekly 2019-11-27     -     -     - simple no yes
67 musclehouse.dk ready 407 0 2019-Dec-08 07:52:15 00:01:06 [click] [click] all records missing field: item_price @weekly 2019-04-17     -     - We only take energy and diet supplements. simple yes yes
68 vestbyens-cykelhandel.dk ready 395 168 2019-Dec-08 15:31:15 00:01:06 [click] [click] 16 Record(s) from last mined in json file have item_price: 0 @weekly 2019-06-05     -     -     -     - yes no
69 aalborgbikezone.dk ready 376 369 2019-Dec-08 17:12:22 00:05:14 [click] [click] @weekly 2019-10-03     -     -     - simple yes no
70 harald-nyborg.dk ready 285 223 2019-Dec-08 11:08:31 00:01:23 [click] [click] @weekly 2019-03-12     -     - Might not take all cycle products from the site, as it only includes urls with some sort of cycle word. This is to prevent unrelated products from being mined.     - yes no
71 velogear.dk ready 282 281 2019-Dec-09 02:10:18 00:03:10 [click] [click] @weekly 2019-10-07     -     -     - simple yes yes
72 gucca.dk broken 216 0 2019-Aug-13 09:07:16 00:00:06 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
all records missing field: item_price
@weekly 2019-08-13 Contact gucca to get special permission or use their data feed from partner ads - "For further information you can contact customer service by e-mail at kundeservice@gucca.dk or phone: 70 27 85 05" "Our system has registered unusual traffic from your computer, please confirm that you are not a robot" DISABLED - Can't mine from gucca anymore, which is strange since they are on partner-ads. They presumably expect people to only use their product feed. simple no yes
73 ventouxbike.dk ready 208 101 2019-Dec-09 08:02:02 00:01:54 [click] [click] @daily 2019-10-10     - (ch: Mines a random number of products in webkit mode, but there aren't any product urls in simple browser mode 10/10/2019) (ch: set to daily as the script only takes a few minutes and it mines kinda randomly. 10/10/2019) webkit no no
74 lecol.cc ready 180 0 2019-Dec-08 16:36:05 00:08:20 [click] [click] all records missing field: item_price @weekly 2019-04-09     -     -     -     - yes yes
75 shop.ventumracing.com     - 147 143 2019-Dec-07 11:58:20 00:03:10 [click] [click] Missing crawler-script.php File
1 Record(s) from last mined in json file have item_price: 0
    -     -     -     -     -     -     - no
76 ahcykler.dk ready 130 42 2019-Dec-03 23:08:58 00:00:50 [click] [click] @weekly 2019-06-11     -     -     - simple yes no
77 pythonpro.com ready 119 119 2019-Dec-03 04:21:10 00:01:02 [click] [click] @weekly 2019-11-29     -     -     - simple yes no
78 figataciclismo.com ready 88 81 2019-Dec-09 14:25:36 00:01:28 [click] [click] @weekly 2019-12-04     -     -     -     - yes no
79 purepower.dk ready 80 68 2019-Dec-04 03:17:17 00:01:09 [click] [click] @weekly 2019-06-28     -     - Webkit with Search Results Match webkit no yes
80 cykel-lygter.dk broken 77 77 2019-Jul-03 11:10:35 00:00:17 [click] [click] status is set to broken
Last data imported was more than 30 days ago.
@weekly 2019-07-04     - Script gets this error message on Jenkins: "Download of https://www.cykel-lygter.dk failed: Socket operation timed out". The script works fine locally on my computer (ch)     - simple     - yes
81 billigsport24.dk ready 54 28 2019-Dec-03 17:32:30 00:01:22 [click] [click] @weekly 2019-04-03     -     -     -     - yes yes
82 shop.cykeloutlet.dk ready 43 0 2019-Dec-04 16:52:09 00:00:00 [click] [click] all records missing field: item_price @weekly 2019-07-03     -     -     - simple no no
83 urbanwinner.dk ready 39 28 2019-Dec-08 00:48:41 00:00:33 [click] [click] @weekly 2019-04-02     -     - Only have about 30 products     - yes yes
84 cykelbanditten.dk ready 37 29 2019-Dec-06 21:46:33 00:00:25 [click] [click] 3 Record(s) from last mined in json file have item_price: 0 @weekly 2019-03-05 Suggestion: a check could be made in the decimal setter so that if dkk is set and the price consists of only 4 digits ("29.00"), it is read as 29 and not 2900. Prices from cykelbanditten in the 10 digits change decimal standards from dk to gb. Fixed by taking the price from a different tag: "data-orginal-price"; the suggestion could still help in future scripts.     - yes no
85 billig-outlet.dk ready 26 26 2019-Dec-08 02:32:29 00:00:21 [click] [click] @weekly 2019-03-12     -     -     -     - yes no
86 joybuggy.dk ready 4 3 2019-Dec-06 04:40:11 00:00:03 [click] [click] Less than 5 products in DB @weekly 2019-08-12     -     - Doesn't have a lot of products, but it ensures our category: misc_multimedia_component is not empty. simple no yes
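
The decimal-setter suggestion noted for cykelbanditten.dk (read "29.00" as 29, not 2900, even when dkk is set) could be sketched roughly as below. This is only an illustration in Python, not the crawler's actual code: the function name normalize_price is hypothetical, and the rule it applies (exactly two digits after the last separator means a decimal part, anything else means a thousands separator) is an assumed generalisation of the note, not something the scripts are known to do.

```python
import re
from decimal import Decimal


def normalize_price(text: str) -> Decimal:
    """Parse a price string whose decimal mark may be '.' or ','.

    Hypothetical helper: assumes that exactly two digits after the last
    separator are a decimal part, and that any other separator is a
    thousands separator.
    """
    # Keep only digits and separators, then drop stray leading/trailing
    # separators such as the final '.' in "kr.".
    cleaned = re.sub(r"[^\d.,]", "", text).strip(".,")
    if not cleaned:
        raise ValueError(f"no digits found in {text!r}")

    last_sep = max(cleaned.rfind("."), cleaned.rfind(","))
    if last_sep == -1:
        return Decimal(cleaned)  # plain integer price

    whole = re.sub(r"[.,]", "", cleaned[:last_sep])
    frac = cleaned[last_sep + 1:]
    if len(frac) == 2:
        # "29.00" or "29,00": treat the last separator as the decimal mark.
        return Decimal(f"{whole}.{frac}")
    # "2.900" or "2,900": treat the separator as a thousands separator.
    return Decimal(whole + frac)


if __name__ == "__main__":
    for sample in ["29.00", "29,00 kr.", "2.900", "1.299,95"]:
        print(sample, "->", normalize_price(sample))
```

A rule like this trades correctness on rare formats (for example prices quoted with three decimals) for robustness against shops that mix dk and gb number formats, so it would still make sense to keep an eye on the item_price: 0 warnings after adopting it.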