My go-to tool for data collection is the SelectorLib library. It is an easy-to-use, quick alternative to building a scraping solution from scratch. There are many ways to use the library, and I will share my workflow. I encourage anyone interested to also take a look at the documentation on the website, which does a good job of providing tutorials and guides that spell things out clearly.
To use this module, you need to install the Python package and the Chrome extension.
pip install selectorlib
You can think of the process of using…
As you probably know, Tor is an anonymity network which utilizes a unique system of Onion Routing (explained later) to keep users of the network anonymous. In this article, I’m going to touch on some of the fundamental principles used by Tor, and how you might find it useful. Started in the mid-1990s at the U.S. Naval Research Laboratory, the project was developed to foster secure communications between spies and other government agents involved in covert investigations. In 2002 the software was released under a free public-use license. Control was handed to the Electronic Frontier Foundation, who in turn handed…
When I was first introduced to the command line on my computer, it seemed confusing and low-tech. It brought me back to my days as a teenager, punching codes into an old-school POS system, or checking inventory in an outdated company intranet. Though these technologies looked dated, they were functional. As I became more comfortable with the terminal, I quickly realized that it was far more than functional: it exposed a host of small programs that you could use to interact with the computer, solve problems, and even chain together in complex ways. …
Often neglected in the implementations of the most popular machine learning and statistical analysis frameworks is survival analysis. Simply put, survival analysis models the time it takes for an event of interest to occur. Although that seems pretty straightforward, the reality is a little more complicated. In this article, we will go through some of the high-level concepts you need to understand when conducting survival analysis, or when deciding if it is the right tool for your problem.
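One reason the reality is more complicated is censoring: some subjects leave the study before the event is ever observed. As a minimal sketch (not taken from the article itself), here is a plain-Python Kaplan-Meier estimate of the survival curve. The function name kaplan_meier and the toy data are made up for illustration.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time until the event or until the subject was censored.
    observed:  True if the event happened, False if the subject was censored.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # everyone leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk subjects that survive time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering the curve.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: five subjects, two of them censored.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)

for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

With this toy data the estimate steps down to 0.80, 0.60, and 0.30 at times 2, 3, and 5. The censored subjects at times 3 and 8 shrink the risk set but never count as events.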
In part 1 of this series, I went over what PostgreSQL is, how it works, and how you can get started adding databases and tables to create a relational database for storing your information. In this article I’m going to assume that you have everything downloaded, and have databases and tables to work with. I’m going to go through a few operations that I used to find myself constantly googling. If you find this information useful, it would be a good addition to your bookmarks. Let’s begin!
Let’s say you have a CSV file that you want to import into a Postgres database using…
PostgreSQL is an open-source Relational Database Management System (RDBMS) that’s popular for a number of reasons: it’s free, it’s secure, it supports custom functions, it has an object-relational architecture, and it places no limit on the number of rows per table. Check out this article for a more in-depth breakdown. PostgreSQL is also used by many major companies and organizations, including NASA, Twitch, Apple, and Reddit. In this article we are going to touch on the basics of PostgreSQL so you can get up and running fast.
On a Mac, the process of downloading Postgres is simplified thanks to the Postgres.app installation package. …
Managing the flow of data through a website or app is a crucial skill to master if you plan on making any sort of modern web service. With Flask, Object Relational Mappers (ORMs) are employed to allow your app to interact with a relational database. An Object Relational Mapper is a framework that, in our case, will allow us to interact with a SQL database using Python instead of explicit SQL queries. The ORM we are using is SQLAlchemy, and it can be installed as follows:
pip install flask_sqlalchemy
pip install psycopg2-binary #for using postgres
In this article, we are going to implement a pre-trained TensorFlow face mask detection model originally developed by Hussain Mujtaba. Some of the code and TensorFlow model training information can be found in his article here.
To begin, let’s go through some of the basics of OpenCV.
First, make a new directory for the project files. Inside of the directory, let’s make a virtual environment to download the necessary packages. If you do not have virtualenv you should run the first line of code, otherwise, skip the first line.
python3 -m pip install --user -U virtualenv
python3 -m virtualenv your_env
Pattern matching and text manipulation from the terminal.
Sed is an early UNIX program meant to function as a non-interactive ‘stream editor’. It is one of the earliest programs to support the use of regular expressions for pattern matching, and has remained a popular tool for editing and filtering streams of text from the command line. In this article, I’m going to go through some of the basics, and provide some examples of what can be accomplished with basic sed programs.
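To make the ‘stream editor’ idea concrete, here is a rough Python analogue of sed’s substitute command (s/pattern/replacement/), applied one line at a time. This is only a sketch of the concept, not how sed is implemented. The helper name substitute and the sample text are made up for illustration, and Python’s re syntax differs from the POSIX regular expressions that sed uses.

```python
import re


def substitute(lines, pattern, replacement, global_flag=False):
    """Yield each line with the pattern replaced, like sed's s command.

    By default only the first match on each line is replaced. Passing
    global_flag=True replaces every match, similar to the trailing g flag.
    """
    regex = re.compile(pattern)
    count = 0 if global_flag else 1  # in re.sub, count=0 means "replace all"
    for line in lines:
        yield regex.sub(replacement, line, count=count)


# Hypothetical input stream.
text = ["the cat sat on the mat", "no match here"]

first_only = list(substitute(text, r"the", "a"))
every_match = list(substitute(text, r"the", "a", global_flag=True))

print(first_only)
print(every_match)
```

Like sed, the helper reads its input line by line and passes unmatched lines through unchanged, which is why it can be used as a filter on a stream of text.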
A brief demonstration using Python and pyautogui.
In the late 2000s, I spent more time than I care to admit playing Halo online. During this time I was exposed to (but did not participate in) the small but unavoidable modding community within the game. These were people who had found ways to cheat or alter the game, either to win or sometimes just to implement fun new features. One of the techniques used was “Standby” cheating, which was when software was used to block the internet connections of the players on the opposing team, send large amounts of information to their connections, and…
Turning over rocks and seeing what crawls out.