Skip to content

Comments

User data Functionality#36

Open
ueg1990 wants to merge 11 commits intokaran:masterfrom
ueg1990:UserData
Open

User data Functionality#36
ueg1990 wants to merge 11 commits intokaran:masterfrom
ueg1990:UserData

Conversation

@ueg1990
Copy link
Contributor

@ueg1990 ueg1990 commented Mar 19, 2014

Added User data functionality to retrieve user data. All the info is extracted except for comments. Could not finish it because getting the following message when testing:

We've limited requests for this url.

If you know why this is happening, please do let me know. Till then, u can review my code. Unittests are still incomplete.

Update: Travis CI build failed because of above issue

@ueg1990
Copy link
Contributor Author

ueg1990 commented Mar 19, 2014

I am not sure why the other 4 commits are showing, u can remove them or merge them ( i think they were fixed). Main part of pull request is 51ac0f8

@ueg1990 ueg1990 closed this Mar 19, 2014
@ueg1990 ueg1990 reopened this Mar 19, 2014
@karan
Copy link
Owner

karan commented Mar 20, 2014

I'll go over this and fix it later. Thanks for the PR though.

@ueg1990
Copy link
Contributor Author

ueg1990 commented Mar 23, 2014

hey karan, so the funny thing is that the all the tests passed except for the 'best' url because of the links that get expired. get_user function worked excepted for the comments parts as that still has to be implemented. i needed ur advice/suggestions for two things:

  1. can u help me figure out how to get pass the 'We've limited requests for this url.' issue. if that can be fixed then i can quickly fix the issue
  2. is there a way to run tests on Travis without doing a pull request. everytime i want to run tests on Travis i do a pull request. if that is ok with u, then ill continue to do this way. i just dont want it to be irritating that i make small changes everytime to a pull request just to run it on Travis.

@ueg1990
Copy link
Contributor Author

ueg1990 commented Mar 24, 2014

Hey karan, do u think it is relevant to get all the comments submitted by a user?? the reason i am asking is because alot of the code has to be re-written to get comments submitted by a user even though the code is similar to the one in _build_comments under the Story class.

@karan
Copy link
Owner

karan commented Mar 25, 2014

Yeah it might be a bad idea to get user comments seeing how HN does not want us to crawl those pages.

1) can u help me figure out how to get pass the 'We've limited requests for this url.' issue. if that can be fixed then i can quickly fix the issue

The best way to fix this is to run tests/queries with some interval between them. Do not run all tests at once, but instead run one module at a time.

2) is there a way to run tests on Travis without doing a pull request. everytime i want to run tests on Travis i do a pull request. if that is ok with u, then ill continue to do this way. i just dont want it to be irritating that i make small changes everytime to a pull request just to run it on Travis.

As far as I know, no. You will have to setup travis for your own fork.

@ueg1990
Copy link
Contributor Author

ueg1990 commented Mar 25, 2014

Thanks for the reply. So if ur ok with not extracting comments submitted by a user, ill update this pull request so that all the other info is crawled. i know the tests pass and then u merge it.

ueg1990 added 3 commits March 25, 2014 12:40
…es not allow crawling of comments submitted by user
…at we never get expired links by clicking the More link
@ueg1990
Copy link
Contributor Author

ueg1990 commented Mar 25, 2014

So weird....my tests pass locally but then fail on travis. Does travis use my ip address when running the tests?? if yes then im assuming HackerNews is blocking my ip address.

@karan
Copy link
Owner

karan commented Mar 26, 2014

No I travis does not use your IP. And the failure is not because of HN. See the trace:

======================================================================
ERROR: test_get_user (tests.test_get_user.TestGetUser)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/karan/HackerNewsAPI/tests/test_get_user.py", line 15, in test_get_user
    test_user = self.hn.get_user('karangoeluw')
  File "/home/travis/build/karan/HackerNewsAPI/hn/hn.py", line 198, in get_user
    table = soup.find_all('table')[2].find_all('tr')
IndexError: list index out of range

@ueg1990
Copy link
Contributor Author

ueg1990 commented Mar 26, 2014

The failure is because of the “Limited request for URL” error. Its what the BeautifulSoup function returns and hence there are no table attributes.

On Mar 25, 2014, at 10:07 PM, Karan Goel notifications@github.com wrote:

No I travis does not use your IP. And the failure is not because of HN. See the trace:

ERROR: test_get_user (tests.test_get_user.TestGetUser)

Traceback (most recent call last):
File "/home/travis/build/karan/HackerNewsAPI/tests/test_get_user.py", line 15, in test_get_user
test_user = self.hn.get_user('karangoeluw')
File "/home/travis/build/karan/HackerNewsAPI/hn/hn.py", line 198, in get_user
table = soup.find_all('table')[2].find_all('tr')
IndexError: list index out of range

Reply to this email directly or view it on GitHub.

@karan
Copy link
Owner

karan commented Mar 26, 2014

So now there's two choices:

  • We either put a sleep interval in each test, but seeing the number of tests we have (and will have), this might be a bad choice.
  • We mock the http responses.

Unit testing on a live site is a bad idea, so I'll do some reading, add some mocks, and then come back to this PR. How's that sound?

@ueg1990
Copy link
Contributor Author

ueg1990 commented Mar 26, 2014

Sure. Sorry i could not find a solution to this.

What do you mean by mock http responses? I can also have a look at how to do this.

On Mar 25, 2014, at 10:14 PM, Karan Goel notifications@github.com wrote:

So now there's two choices:

We either put a sleep interval in each test, but seeing the number of tests we have (and will have), this might be a bad choice.
We mock the http responses.
Unit testing on a live site is a bad idea, so I'll do some reading, add some mocks, and then come back to this PR. How's that sound?


Reply to this email directly or view it on GitHub.

@karan
Copy link
Owner

karan commented Mar 26, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants