Open Link(s) on page

If you want the program to open link(s) on the page, add this action. It can be added by right-clicking while recording actions.

1. Target

See select target.

2. Links type

This shows what type of link(s) the program will open.

  • Href: The href attribute of the selected link DOM element(s).
  • All links in target: All links in the selected DOM(s).
  • Generate URLs: Create URLs from the scraped data. For example, you can set "XPath" and "extract type" to scrape the values of all INPUT controls on the page, then set "Adjust data" with JavaScript code:
'http://www.xxxxx.com/id=' + data;

In this way you can generate special URLs from values found on the page.
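
To illustrate what the "Adjust data" expression does, the sketch below applies the same concatenation to a list of scraped values outside FMiner. The function name buildUrls and the sample values are hypothetical, and the trimming, empty-value check and encodeURIComponent call are precautions added here that the original expression does not include.

```javascript
// Hypothetical helper, not part of FMiner: it only illustrates the idea.
// Each scraped value plays the role of `data` in the "Adjust data" expression.
function buildUrls(prefix, scrapedValues) {
  return scrapedValues
    .map((value) => String(value).trim())        // drop surrounding whitespace
    .filter((value) => value.length > 0)         // skip empty INPUT values
    .map((value) => prefix + encodeURIComponent(value));
}

// Sample values standing in for scraped INPUT values (made up for this sketch).
const urls = buildUrls('http://www.xxxxx.com/id=', ['101', ' 102 ', '', 'a b']);
console.log(urls);
```

Each non-empty value produces one URL, so a page with many INPUT controls yields many URLs to open.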

 

(NOTE: There can be more than one URL. In that case FMiner opens them in multiple browsers for better performance, and you can change the browser count.)


Open link(s) recursively

When this option is checked, the program opens the link(s) recursively. It is often used for a "next page" link.

 

This is a powerful option that is often used for "next" pages. When it is enabled, FMiner keeps opening these links on each newly opened page. You can add other actions after it; FMiner runs those actions once before running this one, so that they are not skipped on the first page. For example, given an open link(s) action A that opens the "next" link, followed by an open link(s) action B that opens the item links, FMiner executes action B once before A, so the links on the first page are not missed.


Max recursive level

0 means unlimited. Change it when you only want to open a certain number of pages. For example, to iterate through only 7, 27 or 77 pages, set the recursive level accordingly.
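
The recursive option and its level limit can be pictured with a small simulation. This is only a sketch, not FMiner's implementation: the pages object, the crawl function and the page names are hypothetical, and it assumes that a level of N limits the run to N pages in total (FMiner's exact counting may differ).

```javascript
// Hypothetical in-memory "site": each page has item links and an optional "next" link.
const pages = {
  'list/1': { items: ['item/a', 'item/b'], next: 'list/2' },
  'list/2': { items: ['item/c'], next: 'list/3' },
  'list/3': { items: ['item/d'], next: null },
};

// maxLevel = 0 means unlimited.
// Assumption: a level of N stops the run after N pages have been visited.
function crawl(startUrl, maxLevel) {
  const opened = [];
  let url = startUrl;
  let level = 0;
  while (url !== null && (maxLevel === 0 || level < maxLevel)) {
    const page = pages[url];
    if (!page) break;               // unknown page: stop the walk
    opened.push(...page.items);     // action B: open the item links on this page first
    url = page.next;                // action A: then follow the "next" link
    level += 1;
  }
  return opened;
}

const allItems = crawl('list/1', 0);   // unlimited
const twoPages = crawl('list/1', 2);   // stop after two pages
console.log(allItems, twoPages);
```

Because the item links are collected before the "next" link is followed, the items on the first page are included, which matches the behaviour described for actions A and B.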

 

URL filter

This filters the URLs you select and discards the links that do not comply with the rules. For example, you can change the target to "//a", which selects all the links on the page, and then set the URL filter to allow only certain links.

1. Domain filter

The Domain filter can be used to ban links on the page that lead to a different domain.
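
As a rough picture of what a domain filter does, the sketch below keeps only the links whose host name equals that of the current page. The function name sameDomainOnly and the example URLs are hypothetical, and comparing exact host names is an assumption; FMiner may treat subdomains differently.

```javascript
// Hypothetical helper: keep only links that stay on the page's own host.
function sameDomainOnly(pageUrl, links) {
  const pageHost = new URL(pageUrl).hostname;
  return links.filter((link) => {
    try {
      // Resolve relative links against the page URL before comparing hosts.
      return new URL(link, pageUrl).hostname === pageHost;
    } catch (err) {
      return false; // discard anything that cannot be parsed as a URL
    }
  });
}

const kept = sameDomainOnly('http://www.example.com/list', [
  '/item/1',
  'http://www.example.com/item/2',
  'http://ads.example.net/banner',
]);
console.log(kept);
```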

 

2. URLs pattern

Here you can set patterns to allow or ban URLs. The pattern format is like "*keyword*", where * matches any sequence of characters.
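
The "*keyword*" format can be illustrated by translating a pattern into a regular expression. This is a sketch under assumptions: patternToRegExp and filterUrls are hypothetical names, matching is case-sensitive here, and the way allow and ban rules are combined (a URL must match an allow pattern, if any are set, and no ban pattern) is a guess rather than FMiner's documented behaviour.

```javascript
// Turn a wildcard pattern into a RegExp: escape every special character,
// then turn each escaped "*" into ".*" (any sequence of characters).
function patternToRegExp(pattern) {
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\\\*/g, '.*') + '$');
}

// Assumed rule: keep a URL if it matches an allow pattern (or none are set)
// and does not match any ban pattern.
function filterUrls(urls, allow, ban) {
  const allowRes = allow.map(patternToRegExp);
  const banRes = ban.map(patternToRegExp);
  return urls.filter((url) => {
    const allowed = allowRes.length === 0 || allowRes.some((re) => re.test(url));
    const banned = banRes.some((re) => re.test(url));
    return allowed && !banned;
  });
}

const result = filterUrls(
  [
    'http://www.example.com/product/1',
    'http://www.example.com/product/1?print=1',
    'http://www.example.com/about',
  ],
  ['*product*'],
  ['*print*']
);
console.log(result);
```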

 

3. Filter in Data Table

Ban or allow the URLs stored in a table. This is useful for incremental data extraction. You can save each page's URL to a table when capturing data, then set the filter to ban the URLs in that column. When you run the extractor again, the program scrapes only the new links that are not yet in that column of the table.
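
The incremental idea can be sketched with a set that stands in for the URL column of the table. The names savedUrls and newLinksOnly and the example URLs are hypothetical; in FMiner the saved URLs live in a data table rather than in a script.

```javascript
// Hypothetical stand-in for the URL column saved by an earlier run.
const savedUrls = new Set([
  'http://www.example.com/item/1',
  'http://www.example.com/item/2',
]);

// Return only the links that are not in the saved column yet,
// and record them so that a later run bans them as well.
function newLinksOnly(links, seen) {
  const fresh = [];
  for (const link of links) {
    if (!seen.has(link)) {
      fresh.push(link);
      seen.add(link);
    }
  }
  return fresh;
}

const fresh = newLinksOnly(
  [
    'http://www.example.com/item/2',
    'http://www.example.com/item/3',
    'http://www.example.com/item/3',
  ],
  savedUrls
);
console.log(fresh);
```

Running the same call a second time returns an empty list, which is the effect the ban filter has on a repeated extraction.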