Stop overriding default port configurations#40
Conversation
…h would end up in a OpenSSH error.
Stop overriding default port configurations
|
I'll push to npm soonish, but I'm trying to resolve an error/semantic problem (#39) before I do so. :) |
|
No worries and thanks for the great crawler :) Autodiscovery saved me a ton of time. One issue I'm still having however is when I try to discover 50+ links the state of the app ends up in a situation where always evaluates to true and the openRequests never timeout (even after setting the timeout option to something small like 5 or 10 seconds) and the app is just left spinning in cycles waiting on requests that never seem to end. I've tried to fix it for hours to no avail. As such, I was thinking of porting the Queue to use Async's built in queue implementation to handle making and receiving requests. Any thoughts on this? |
|
(BTW I totally forgot to tell you, but 0.2.7 is on npm now.:)) In regards to that issue, perhaps I need to be tighter with the timeouts. I'm pretty sure the problem does not relate to the queue, and I'd like it to retain a relatively general interface so that a queue can be implemented, for example, in Redis, and shared between multiple machines running the same crawl between then. I'll have a look into whether there's a simple way to fix this problem for you - that's most definitely 100% a bug. :) |
|
I looked into it more and I think the Queue is functioning properly, its just that some requests were taking 150+ seconds to respond. I also realized you aren't using the timeout anywhere so I added: and that fixed my problem right up once I set the timeout to 10 seconds. |
|
Sure, the missing timer is the bug. :) I've got a bit of stuff on right now but I'll see if I can get to this later today. Thanks! |

I was getting internal client errors because the Queue was being populated with port:80 queueItems but the protocol was listed as https. This caused a call to https.get({port:80}) which throws an OpenSSH error: