Why Python?

When we think about exploring a new programming language or technology, we often wonder about the scope of the new technology and how it might benefit us. Let's start this chapter by thinking about why we might want to use Python and what advantages it might give us.

To answer this question, we are going to think about current technology trends and not get into more language-specific features, such as the fact that it is object-oriented, functional, portable, and interpreted. We have heard these terms before. Let's try to think about why we might use Python from a strictly industrial standpoint, what the present and future landscapes of this language might look like, and how the language can serve us. We'll start by mentioning a few career options that someone involved in computer science might opt for:

  • Programmer or software developer
  • Web developer
  • Database engineer
  • Cyber security professional (penetration tester, incident responder, SOC analyst, malware analyst, security researcher, and so on)
  • Data scientist
  • Network engineer

There are many other roles as well, but we'll just focus on the most generic options for the time being to see how Python fits into them. Let's start off with the role of programmer or software developer. As of 2018, Python was recorded as the second most popular language listed in job adverts (https://www.codingdojo.com/blog/7-most-in-demand-programming-languages-of-2018/). The role of programmer might vary from company to company, but as a Python programmer, you might be making a software product written in Python, developing a cyber security tool written in Python (there are tons of these already in existence that can be found on GitHub and elsewhere in the cyber security community), prototyping a robot that can mimic humans, engineering a smart home automation product or utility, and so on. The scope of Python covers every dimension of software development, from typical software applications to robust hardware products. The reason for this is the ease of the language to understand, the power of the language in terms of its excellent library support, which is backed by a huge community, and, of course, the beauty of it being open source.

Let's move on to the web. In recent years, Python has done remarkably well in terms of its maturity as a web development language. The most popular full stack web-based frameworks such as Django, Flask, and CherryPy have made web development with Python a seamless and clean experience, with lots of learning, customization, and flexibility on the way. My personal favorite is Django, as it provides a very clean MVC architecture, where business, logic, and presentation layers are completely isolated, making the development code much cleaner and easier to manage. With all batteries loaded and support for ORM and out-the-box support for background task processing with celery, Django does everything that any other web framework would be capable of doing, while keeping the native code in Python. Flask and CherryPy are also excellent choices for web development and come with lots of control over the data flow and customization.

Cyber security is a field that would be incomplete without Python. Every industry within the cyber security domain is related to Python in one way or another and the majority of cyber security tools are written in Python. From penetration testing to monitoring security operations centers, Python is widely used and needed. Python aids penetration testers by providing them with excellent tools and automation support with which they can write quick and powerful scripts for a variety of penetration testing activities, from reconnaissance to exploitation. We will learn about this in great detail throughout the course of this book.

Machine learning (ML) and artificial intelligence (AI) are buzz words in the tech industry that we come across frequently nowadays. Python has excellent support for all ML and AI models. Python, by default in most cases, is the first choice for anyone who wants to learn ML and AI. The other famous language in this domain is R, but because of Python's excellent coverage across all the other technology and software development stacks, it is easier to combine machine learning solutions written in Python with existing or new products than it is to combine solutions written in R. Python has got amazing machine learning libraries and APIs such as scikit-learn, NumPy, Pandas, matplotlib, NLTK, and TensorFlow. Pandas and NumPy have made scientific computations a very easy task, giving users the flexibility to process huge datasets in memory with an excellent layer of abstraction, which allows developers and programmers to forget about the background details and get the job done neatly and efficiently.

A few years ago, a typical database engineer would have been expected to know relational databases such as MySQL, SQL Server, Oracle, PostgreSQL, and so on. Over the past few years, however, the technology landscape has completely changed. While a typical database engineer is still supposed to know and be proficient with this database technology stack, this is no longer enough. With the increasing volume of data, as we enter the era of big data, traditional databases have to work in conjunction with big data solutions such as Hadoop or Spark. Having said that, the role of the database engineer has evolved to be one that includes the skill set of a data analyst. Now, data is not to be fetched and processed from local database servers—it is to be collected from heterogeneous sources, pre-processed, processed across a distributed cluster or parallel cores, and then stored back across the distributed cluster of nodes. What we are talking about here is big data analytics and distributed computing. We mentioned the word Hadoop previously. If you are not familiar with it, Hadoop is an engine that is capable of processing huge files by spawning chunks of files across a cluster of computers and then performing an aggregation on the processed result set, something which is popularly known as a map-reduce operation. Apache Spark is a new buzzword in the domain of analytics and it claims to be 100 times faster than the Hadoop ecosystem. Apache Spark has got a Python API for Python developers called pyspark, using which we can run Apache Spark with native Python code. It is extremely powerful and having familiarity with Python makes the setup easy and seamless.

The objective of mentioning the preceding points was to highlight the significance of Python in the current technological landscape and in the coming future. ML and AI are likely to be the dominating industries, both of which are primarily powered by Python. For this reason, there will not be a better time to start reading about and exploring Python and cyber security with machine learning than now. Let's start our journey into Python by looking at a few basics.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.19.27.178