#1 March 2, 2013 11:44:32

sketman
Registered: 2013-01-22
Posts: 13

Scraping this element

Hello,

could you please have a look at the following page?
http://nakoz.org/profile.php?mode=viewprofile&u=11
You will see a small “email” icon there.
When you hover over the icon with your mouse, you will see the email address. When you click it, a new message window opens in your default e-mail program.

However, I am not able to scrape this widget. When setting the extraction type, I have tried all the options, but nothing works. The e-mail address cannot be scraped.

I have seen this kind of element on other sites too. For example: http://www.musicscaper.sk/?action=katalog_detail&id=531&ak=1. There you can see the Email: field with the text “klikne pre email”. It seems to me to be the same system as on the site above.

Could you please recommend how to get the information (email address) from this kind of element?

Thank you very much.

#2 March 3, 2013 07:02:20

admin
Registered: 2012-03-15
Posts: 289

For the first page, http://nakoz.org/profile.php?mode=viewprofile&u=11, just set “extract type” to “dom attribute” with “href” and you can scrape the email. I've made a demo that does this; please see the attachment.
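If the scraped “href” value turns out to be a mailto: link (an assumption based on the icon opening the default e-mail program), the address still has to be separated from the scheme and any query string. A minimal sketch of that cleanup step follows; `emailFromMailto` is a hypothetical helper name, not part of FMiner.

```javascript
// Hypothetical helper: turn a scraped href value such as
// "mailto:someone@example.com?subject=Hi" into a bare e-mail address.
// Returns null when the value is not a usable mailto: link.
function emailFromMailto(href) {
  if (typeof href !== "string") return null;
  // Capture everything after "mailto:" up to an optional "?query" part.
  const match = /^mailto:([^?]*)/i.exec(href.trim());
  if (!match) return null;
  let address = match[1];
  try {
    // Addresses inside a mailto: URL may be percent-encoded.
    address = decodeURIComponent(address);
  } catch (err) {
    // Malformed escape sequence: keep the raw text instead of failing.
  }
  return address === "" ? null : address;
}
```

This could run as a post-processing step on the exported column, outside the scraping project itself.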

The page http://www.musicscaper.sk/?action=katalog_detail&id=531&ak=1 is different: it protects the email address, and JavaScript reveals the real address only on “mouseover”. To scrape this kind of page, add a “runjs” action before the “scrape” action and write some JavaScript that fires the “mouseover” event, then scrape the email the same way as on the first page. Doing this requires knowing how to write JavaScript code.
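For readers who have not written such a script before, here is a rough sketch of what firing a “mouseover” event can look like in plain JavaScript. It assumes the page reveals the address in an ordinary mouseover handler; `fireMouseover` is a hypothetical name, and the selector in the usage comment is a guess that has to be adapted to the real markup. How the “runjs” action passes code to the page is not shown here.

```javascript
// Hypothetical sketch: dispatch a synthetic "mouseover" event on an element
// so that the page's own handler runs and reveals the address.
function fireMouseover(element) {
  // bubbles: true lets handlers registered on ancestor elements see it too.
  const event = new Event("mouseover", { bubbles: true, cancelable: true });
  element.dispatchEvent(event);
  return element;
}

// Possible use inside the page (the "a" selector is an assumption):
// document.querySelectorAll("a").forEach(fireMouseover);
```

A plain `Event` carries no pointer coordinates; if the page's handler reads them, a `MouseEvent` created in the browser may be needed instead.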

Attachments:
attachment nakoz.fmp (108.0 KB)

#3 March 4, 2013 12:15:40

sketman
Registered: 2013-01-22
Posts: 13

Hello,
thank you very much for your help.

The first page (Nakoz): I knew about that setting and tried it, but I could not see the output (the e-mail address) in the preview window (Selection->Results), so I thought that setting Extract type to href doesn't work in this case. I must have made a mistake; probably the page was not loaded correctly, or something else prevented me from seeing the result.

The second page: I don't know JavaScript, unfortunately, so I will have to find some other solution.

Thank you again for help.

#4 Nov. 21, 2013 09:07:34

voipnick
Registered: 2013-11-19
Posts: 6

Hi,
I'm wondering how to extract all the pages from this site. The next-page link is an icon that shows blank.gif.
I've tried “open links recursively” with most of the options I could find, but it only repeats the first page twice.
Any suggestions?

http://www.ticketmaster.ca/search?tm_link=tm_header_search&user_input=las+vegas+zarkana&q=las+vegas+zarkana

Thanks,
Nick

#5 Nov. 22, 2013 01:58:44

admin
Registered: 2012-03-15
Posts: 289

Yes. The position of this image often changes on the page, so you can do this: first click “expand” to change the XPath to something like //a, then click “change target xpath” -> “with text "go to next page"”.

Also, this page uses AJAX to change its content, so you should not use “openlink”; use “click” in a loop to scrape the pages instead. See the attachment: I made a simple demo that scrapes every name and date on the pages.
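The click-and-loop idea can be sketched outside FMiner as well. The following is only an illustration of the control flow, not FMiner's implementation: `scrapePage`, `findNext`, and `waitForChange` are hypothetical callbacks the caller supplies, and `maxPages` is an assumed safety limit.

```javascript
// Sketch of "click and loop" for a page that swaps its content via AJAX.
// scrapePage():    returns (or resolves to) an array of rows for the current view
// findNext():      returns the "go to next page" element, or null on the last page
// waitForChange(): resolves once the new content has arrived
async function scrapeAllPages({ scrapePage, findNext, waitForChange, maxPages = 100 }) {
  const rows = [];
  for (let page = 0; page < maxPages; page += 1) {
    rows.push(...(await scrapePage()));
    const next = await findNext();
    if (!next) break;      // no next-page link left, so this was the last page
    next.click();          // stay in the same document; only the content changes
    await waitForChange(); // do not scrape again before the update is in
  }
  return rows;
}
```

Opening the link as a new page would reload the initial results each time, which may be why the first page was repeated; clicking keeps the already loaded document and lets the AJAX call replace the list.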

Attachments:
attachment 1.fmpx (24.1 KB)

#6 Nov. 22, 2013 16:05:19

voipnick
Registered: 2013-11-19
Posts: 6

Excellent! Thanks as I'm new to this tool. The example helped me the most.

#7 Nov. 26, 2013 20:51:49

voipnick
Registered: 2013-11-19
Posts: 6

Hi again,
I have another strange thing happening.
This FMiner project is supposed to extract both extended and non-extended reviews.
Instead, it skips non-extended reviews if they occur at the beginning of the page, up to the point where it reaches a “more” link (an extended review).
I've been working around this crudely by using two FMiner projects (one with extended reviews, one without) and filling in the blanks between them via a database. I would prefer to have everything extracted properly in one project.

Is this possible?
Thanks,
Nick

Attachments:
attachment Zar_1b_with_more_bug.fmpx (43.2 KB)

#8 Nov. 27, 2013 03:04:50

admin
Registered: 2012-03-15
Posts: 289

I've checked the project. FMiner can attach more than one “scrape page” node to the same node, so you can add another “scrape page” action branching from the “wait” action to scrape just the non-extended reviews at the top, and select “save to table” with the same table. It will then scrape all the reviews and save them to the same table.

#9 Nov. 27, 2013 11:41:49

voipnick
Registered: 2013-11-19
Posts: 6

Hi again,
I tried many different ways and I can't get the extended reviews to pull correctly.
Am I missing something?
I've attached the latest project, which I think should work. I even added a second review field to the same table, as suggested, to distinguish the extended reviews, and it still doesn't work.

Thanks for all your help
Nick

Attachments:
attachment TripA_Zar_all_ok_dupes.fmpx (81.4 KB)

#10 Nov. 27, 2013 19:08:05

admin
Registered: 2012-03-15
Posts: 289

The project you made is a bit messy, and I have not checked it in detail. Maybe what I meant was not clear: you can add another “scrape page” action on the last “wait” action to scrape just the reviews that the first “scrape page” action could not scrape. I've made the project for you; try it.

Attachments:
attachment Zar_1b_with_more_bug (1).fmpx (79.8 KB)
