Challenge - Week One: (12/29 to 1/5)
Create a web crawler that scrapes our homepage (https://icodestuff.io) and indexes every anchor tag on it. Use the language of your choice, though PHP is preferred. For each anchor, the data needs to include the href and the anchor text.
- Using PHP with cURL (Guzzle is an alternative), or the language of your choice, fetch the HTML contents of the page.
- Using native PHP, loop through the results and create an array with the base URL as the root key, like so: [https://icodestuff.io] => [CONTENT]
- After looping through the content and creating the array, convert it to JSON.
- Once the data is in JSON, write it to a file named content.json.
- Once all the prior steps are complete and everything works properly, email your code to email@example.com. If you used a language other than PHP, also include instructions on how to run your code.

Helpful tools:
- PHP cURL or Guzzle: Retrieves the contents of our homepage
- DOMDocument: Lets you select specific HTML elements
- XAMPP: Launches a server on your local machine so you can run PHP files

What we're looking for:
- Content: The JSON needs to be identical to ours
- Efficiency: Make sure your code is as efficient as possible
- Integrity: We don't tolerate cheaters
- Simplicity: Don't overcomplicate things; keep your solution as simple as possible
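
To make the steps listed earlier concrete (fetch the page, collect the anchors, build the array, convert to JSON, write content.json), here is a minimal sketch in PHP. It is not the reference solution. The function names (fetchHtml, extractAnchors, buildIndex, writeIndex) and the 'href' and 'text' keys are assumptions made for illustration, and the exact layout of the expected content.json is not given here, so treat the array shape as a guess.

```php
<?php
// Sketch only: helper names and the 'href'/'text' keys are hypothetical.
// The cURL, DOMDocument, libxml and json_* calls are PHP built-ins.

/** Fetch the raw HTML of a page with cURL. */
function fetchHtml(string $url): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 10,   // give up after 10 seconds
    ]);
    $html = curl_exec($ch);
    if ($html === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $html; // the handle is released once $ch is no longer referenced
}

/** Return every anchor in the HTML as ['href' => ..., 'text' => ...]. */
function extractAnchors(string $html): array
{
    $anchors = [];
    if (trim($html) === '') {
        return $anchors; // loadHTML() rejects an empty string
    }

    // Real-world HTML is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $a) {
        $anchors[] = [
            'href' => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
    return $anchors;
}

/** Build the array with the base URL as the root key. */
function buildIndex(string $baseUrl, string $html): array
{
    return [$baseUrl => extractAnchors($html)];
}

/** Encode the index as JSON and save it to the given path. */
function writeIndex(array $index, string $path): void
{
    $json = json_encode($index, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
    if ($json === false) {
        throw new RuntimeException('JSON encoding failed: ' . json_last_error_msg());
    }
    if (file_put_contents($path, $json) === false) {
        throw new RuntimeException("Could not write $path");
    }
}

// Usage (makes a network request, so it is left commented out):
// $base = 'https://icodestuff.io';
// writeIndex(buildIndex($base, fetchHtml($base)), 'content.json');
```

Keeping the fetch separate from the parsing means the parsing and JSON steps can be checked against a small HTML string without touching the network.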
Two winners will be announced a few days after the deadline. Each winner will receive a shoutout on social media and a free shirt or mug. Remember, all submissions must be sent to firstname.lastname@example.org for your chance to win. Good luck!
In this demo I will be using https://www.youtube.com as the base URI, and I will be running the code in both the command line and the browser, with PHP as my language.
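
Since the same script is meant to run from the command line and in the browser, one small piece worth sketching is the output step. The helper below is an assumption for illustration and not necessarily what the demo does: formatForSapi is a made-up name, and the anchor entry in the sample data is a placeholder rather than real scraped content.

```php
<?php
// Sketch only: formatForSapi() is a hypothetical helper, and the anchor below is placeholder data.

/** Format a JSON string for the terminal (plain) or the browser (escaped inside <pre>). */
function formatForSapi(string $json, string $sapi): string
{
    if ($sapi === 'cli') {
        return $json . PHP_EOL;
    }
    // In a browser, escape the JSON so any markup in it is displayed rather than rendered.
    return '<pre>' . htmlspecialchars($json, ENT_QUOTES, 'UTF-8') . '</pre>';
}

// Placeholder index using the demo's base URI as the root key.
$index = [
    'https://www.youtube.com' => [
        ['href' => '/example', 'text' => 'Example & placeholder'],
    ],
];
$json = json_encode($index, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);

// PHP_SAPI is 'cli' on the command line and a server name (for example 'apache2handler') under a web server.
echo formatForSapi($json, PHP_SAPI);
```

Saved as crawler.php (a file name assumed here), this would be run from the command line with `php crawler.php`. With XAMPP, placing the file in the htdocs directory makes it reachable in the browser at http://localhost/crawler.php.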