This content originally appeared on DEV Community and was authored by Midhun
I needed a list of all the links on a webpage for a task I was working on. Here is the snippet of code I used. Let's discuss how to improve it.
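// Collect the visible text and the resolved URL of every <a> element on the page.
var tags = document.querySelectorAll("a");
var myarray = [];
for (var i = 0; i < tags.length; i++) {
  // Collapse runs of whitespace so multi-line link text becomes a single line.
  var cleantext = tags[i].textContent.replace(/\s+/g, " ").trim();
  var cleanlink = tags[i].href; // .href is the absolute URL, not the raw attribute
  myarray.push([cleantext, cleanlink]);
}

function generateJson() {
  // Convert each [text, link] pair into an object: n = link text, m = link URL.
  var hrefArray = [];
  for (var i = 0; i < myarray.length; i++) {
    hrefArray.push({ n: myarray[i][0], m: myarray[i][1] });
  }
  // Open a blank window; a first argument such as "Json" would be loaded as a URL.
  var win = window.open("", "_blank");
  // Set the JSON as text so characters like "<" in link text are not parsed as HTML.
  win.document.body.textContent = JSON.stringify(hrefArray);
}

generateJson();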
Steps
- Open the website whose links you want to collect in your browser
- Go to the Console tab of the developer tools (Inspect element)
- Paste the above code and press Enter. The links, formatted as JSON, will open in a new window
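As one possible improvement, here is a minimal sketch of a more compact variant that can be pasted into the console in the same way. The function name `linksToJson` is hypothetical and not part of the original snippet, and skipping empty and duplicate URLs is an assumption about what is wanted rather than something the original code does. It keeps the same `n` (link text) and `m` (link URL) keys and returns the JSON string instead of opening a window.

```javascript
// Hypothetical helper (not from the original post): turns anchor-like objects
// into [{ n, m }] records and returns them as a pretty-printed JSON string.
function linksToJson(anchors) {
  const seen = new Set();
  const records = [];
  for (const a of Array.from(anchors)) {
    // Same whitespace cleanup as the original snippet.
    const text = (a.textContent || "").replace(/\s+/g, " ").trim();
    const href = a.href || "";
    // Assumption: anchors without a URL and repeated URLs are not wanted.
    if (href === "" || seen.has(href)) continue;
    seen.add(href);
    records.push({ n: text, m: href });
  }
  return JSON.stringify(records, null, 2);
}

// In the browser console:
// console.log(linksToJson(document.querySelectorAll("a")));
```

Because the function only reads `textContent` and `href`, it can also be tried outside the browser with plain objects that have those two properties.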
Screenshots
- How to Run
- Result
Please let me know your thoughts after reading.
Midhun | Sciencx (2022-01-17T15:10:35+00:00) Simple web scraper that reads all the links to JSON files in JS. Retrieved from https://www.scien.cx/2022/01/17/simple-web-scraper-that-reads-all-the-links-to-json-files-in-js/