Marco Giordano

@GiordMarco96

26 Tweets 18 reads Sep 20, 2022
A list of the most useful #Python libraries you can use for #SEO right now. 🐍
This updated thread will tell you the main libraries for #DataScience and #NLP that you should consider.
Use them in your workflow! 🧵
Numpy & Pandas: the foundations for data analysis, just learn them.
Without these 2 libraries, you cannot do Data Science at all. Good knowledge of Pandas can get you quite far.
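To give a feel for it, here is a minimal sketch of a typical Pandas task: computing click-through rate per query. The column names and numbers are made up for illustration and do not come from any real export.

```python
import pandas as pd

# Hypothetical Search Console-style rows; values are invented for this example.
df = pd.DataFrame(
    {
        "query": ["python seo", "python seo", "seo tools"],
        "clicks": [10, 30, 5],
        "impressions": [100, 300, 50],
    }
)

# Sum clicks and impressions per query, then derive CTR from the totals.
by_query = df.groupby("query", as_index=False)[["clicks", "impressions"]].sum()
by_query["ctr"] = by_query["clicks"] / by_query["impressions"]

print(by_query)
```

Summing first and dividing afterwards avoids averaging per-row CTRs, which would weight low-impression rows too heavily.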
Advertools: the best SEM library out there.
It's very useful for crawling, log file analysis, analyzing SERPs and querying the Knowledge Graph.
The ideal Swiss Army knife you need in your arsenal.
advertools.readthedocs.io
Ecommercetools: The ideal package for analyzing eCommerce data and getting access to some useful NLP functions.
It's a rare jewel in your collection that is very handy for technical SEO and e-commerce as well.
pypi.org
Requests: make HTTP requests via Python, essential for web scraping.
Sure, there are alternatives, but you should learn this one. It's very important and a lot of your initial work will require this library.
pypi.org
urllib: for working with URLs. It should be part of your arsenal.
Take some time to study all the options and possible use cases.
docs.python.org
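A small sketch of what that looks like in practice, using only `urllib.parse` from the standard library. The URL is a made-up example.

```python
from urllib.parse import parse_qs, urljoin, urlparse

# Made-up URL used only for illustration.
url = "https://example.com/blog/post?utm_source=twitter&page=2"

# Split the URL into its components and decode the query string.
parts = urlparse(url)
params = parse_qs(parts.query)

host = parts.netloc        # domain part of the URL
path = parts.path          # path without the query string
page = params["page"][0]   # parse_qs returns a list of values per key

# Resolve a root-relative link against the page it was found on.
absolute = urljoin(url, "/category/seo")

print(host, path, page, absolute)
```

This kind of parsing is handy for grouping URLs by directory or stripping tracking parameters before analysis.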
BeautifulSoup: a library to extract data from HTML/XML files, used in combination with scraping libraries to convert data into Python objects.
One of the first ones you'll probably learn in your Python journey.
crummy.com
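As a rough sketch, this is how BeautifulSoup can pull common on-page elements out of HTML. The HTML string is invented for the example; in real work it would come from a scraping library.

```python
from bs4 import BeautifulSoup

# Invented HTML standing in for a downloaded page.
html = """
<html><head><title>Example page</title>
<meta name="description" content="A made-up description."></head>
<body><h1>Main heading</h1><a href="/a">A</a><a href="/b">B</a></body></html>
"""

# Use Python's built-in parser so no extra dependency is needed.
soup = BeautifulSoup(html, "html.parser")

title = soup.title.string
description = soup.find("meta", attrs={"name": "description"})["content"]
h1 = soup.h1.get_text(strip=True)
links = [a["href"] for a in soup.find_all("a", href=True)]

print(title, description, h1, links)
```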
Scrapy: the absolute peak of scraping.
Nothing is better than this, even though the setup may be hard.
You can carry out any scraping task with this library.
Matplotlib/Seaborn/Plotly: you need some sort of visualization and these libs are here to help you.
You can start with Seaborn which is easier to use. DataViz is an important topic and you should value it.
NLTK/spaCy: work with human language to analyze text data and get insights into the nuances of our language.
This is necessary to get your hands dirty with text data.
The latter can be used to recognize entities and parts of speech.
Querycat: few functions but good quality thanks to association rule mining and BERT.
It's one of my favorite libraries, but the installation may not be immediate.
It's useful for visualizing losses in impressions over time.
github.com
Transformers: Pretrained models to handle a wide range of tasks. Essential for NLP!
This library is crucial for the most advanced tasks and quite reliable too. I highly suggest you check my other thread:
sentence_transformers: Python framework for state-of-the-art sentence, text, and image embeddings.
Use it for keyword clustering and other text-related tasks. It's one of my most used libraries right now.
sbert.net
Streamlit/Dash: interactive web applications.
Useful for prototyping and communicating.
Streamlit is one of the most popular solutions in the SEO community.
Typer: create apps that you can run from your command line.
Extremely powerful for personal uses and for running local scripts.
A game-changer for automating your workflow.
typer.tiangolo.com
networkx: the must-have graph theory library.
I recommend you learn it once you have mastered the basics.
Graph Theory is of great importance for analysts who want to level up their game.
More on this in future threads.
networkx.org
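One possible use, sketched under the assumption that you already have internal links as (source, target) pairs: model the site as a directed graph, then count inlinks and click depth from the homepage. The pages below are hypothetical.

```python
import networkx as nx

# Hypothetical internal links as (source page, target page) pairs.
links = [
    ("/", "/blog"),
    ("/", "/about"),
    ("/blog", "/blog/post-1"),
    ("/blog/post-1", "/"),
    ("/about", "/"),
]

# A directed graph, because a link from A to B does not imply one from B to A.
G = nx.DiGraph()
G.add_edges_from(links)

# Number of internal links pointing at each page.
in_links = dict(G.in_degree())

# Click depth: fewest clicks needed to reach each page from the homepage.
depth = nx.single_source_shortest_path_length(G, "/")

print(in_links)
print(depth)
```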
BERTopic: one of my most used NLP libraries and for good reasons. I dedicated an entire thread to the topic:
scattertext: library for finding distinguishing terms in corpora and displaying them in an interactive HTML scatter plot.
A short example from the official docs: github.com
openpyxl: if you have to work with Excel data and create spreadsheets.
There are other libraries but I prefer to use this one. It's quite nice and it works well for most of the tasks.
openpyxl.readthedocs.io
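A minimal sketch of writing and reading back a spreadsheet with openpyxl. The sheet name and rows are invented, and the in-memory buffer is only there to keep the example self-contained; a file name such as "report.xlsx" would work the same way.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

# Build a workbook with one sheet and a few made-up rows.
wb = Workbook()
ws = wb.active
ws.title = "Keywords"
ws.append(["keyword", "clicks"])
ws.append(["python seo", 40])
ws.append(["seo tools", 5])

# Save to an in-memory buffer; pass a path instead to write a real file.
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)

# Read the sheet back as plain tuples of cell values.
rows = list(load_workbook(buffer)["Keywords"].iter_rows(values_only=True))

print(rows)
```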
Start with scraping and data analysis.
Then, you can move to NLP libraries and study topics like NER and Clustering.
Sticking to the mainstream libraries is necessary to get access to "better" documentation.
My suggestion is to try alternatives and always look for new opportunities across the web.
Be sure to always do your research, you could find the perfect library for your needs.
Follow me for threads, tips, and case studies (coming soon) about SEO, content, and Python/data.
If you liked this thread, consider liking and retweeting it! 🧵
I offer short consultancies and full freelancing for publishers and B2C content.
bookk.me
