


Data mining on tennis racquets

Introduction to the methodology

1. Introduction
Data collection is the first step in every data science project. Although data archives abound, it is often more convenient to treat the internet as your database. This is typically a software engineer's job, but you might find yourself doing it at a smaller company. Let's get started!
2. Information
Since I play tennis, the data comes from tennisexpress.com. This small dataset has 213 instances and 15 features.
I always give credit where it is due to sources I find useful, and this time is no exception. You can see the one I used here: https://www.youtube.com/watch?v=mebu-4xs2ru. If anything I write is unclear, feel free to watch the video.
We're essentially creating three functions: request(), parse(), and output(). Their purposes are self-explanatory: connect to the site, parse the pages, and save the results.
We'll use the requests-html library, which can be found at https://pypi.org/project/requests-html/. One of its biggest advantages is that it can render dynamically loaded content, which appears on almost every modern web page.

You may be wondering what dynamically loaded content is. It is content that the client (your browser), not the server, puts on the page by running JavaScript. So why does that matter? The short answer is that you may not be able to locate the information you need. The long answer is that when I fetch a page from the server and parse it with BeautifulSoup, the result won't include anything rendered by JavaScript or AJAX (that is, dynamically loaded content), because the server only sends the initial HTML and it is the client that renders the rest.

Don't panic. First, check whether the data you need is dynamically loaded by inspecting the element in your browser and looking for a script tag that runs JavaScript or AJAX. If you see a script tag inside the larger tag that also wraps the information you want, then I'm sorry, your data is dynamically loaded. The good news is that dynamically loaded content is used on nearly 90% of modern websites, so this expertise is in high demand.
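A complementary check to inspecting the page by hand is to test whether a value you can see in the browser also appears in the raw HTML the server sent. The sketch below is only an illustration using Python's standard library; the function name `appears_in_static_html` and the sample pages are hypothetical and not part of the original code.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is not inside a script or style element.
        if not self._skip_depth:
            self.chunks.append(data)


def appears_in_static_html(raw_html, expected_text):
    """Return True if expected_text is in the text the server sent."""
    collector = _TextCollector()
    collector.feed(raw_html)
    collector.close()
    # Collapse runs of whitespace so line breaks do not hide a match.
    visible = " ".join(" ".join(collector.chunks).split())
    return expected_text in visible


# Hypothetical pages: one ships the spec as markup, one builds it in a script.
static_page = "<html><body><div class='spec'>Head Size: 100 sq in</div></body></html>"
dynamic_page = (
    "<html><body><div id='specs'>"
    "<script>render({'head': '100 sq in'})</script>"
    "</div></body></html>"
)
```

If the function returns False for a value you can clearly see in the browser, the value is most likely filled in by JavaScript after the page loads, and a plain server request will not be enough.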

There are a variety of solutions, including using Selenium or even writing your own class to imitate a client. These are often overly complicated for scraping, however, since they were not designed for that purpose. Fortunately, the requests-html library is available. Let's take a look at the first of our functions: request().

The request() function's primary goal is to render the dynamically loaded content and return the target data. But how does it do that? Ignore 'productlist' for the time being; I'll illustrate it in the parse() function later. To render the website, we must first fetch the page with the get() method. Passing the URL in as a parameter is important because it lets us reuse the code with a different URL. Beneath the get() call is our main character, html.render(), which acts as a client and renders all the dynamically loaded content. After that, I used XPath to find the section I wanted (in the browser's inspector: right click, select Copy, then select Copy XPath). This is, in my opinion, the simplest method, but it is by no means the only one. Other options can be found on the library's website.
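The steps above can be sketched roughly as follows. This is a reconstruction from the description, not the original code: the `xpath` parameter is an assumption (the original XPath string is not shown here), and `productlist` is simply the variable name mentioned in the text.

```python
def request(url, xpath):
    """Fetch url, run its JavaScript, and return the element matched by xpath."""
    # Imported here so the sketch can be loaded without the dependency installed.
    from requests_html import HTMLSession

    session = HTMLSession()
    response = session.get(url)
    # render() loads the page in headless Chromium (downloaded on first use),
    # so content produced by JavaScript ends up in response.html.
    response.html.render(sleep=1)
    # first=True returns a single element instead of a list,
    # or None when nothing matches the XPath.
    productlist = response.html.xpath(xpath, first=True)
    return productlist
```

A call would look like `request(index_url, copied_xpath)`, where both arguments are placeholders for the racquet index URL and the XPath copied from the browser. The `sleep=1` pause is an arbitrary choice to give scripts time to finish; slower pages may need a larger value.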

Before I go any further, I need to quickly explain how the website is organised and what my strategy is, so that everything makes sense. The information I need, the specifications of each individual racquet, is not on the racquet index page; each racquet has its own page. As a result, I must first obtain the URL of every racquet from the racquet index page, as we did in the request() function. Then I pass that result to the parse() function, where I parse each individual racquet page one by one to collect its specs, which leads me to the next function: parse().
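To make the two-step strategy concrete, here is a minimal standard-library sketch: collect every link on an index page, then hand each one to a per-page function. The names `collect_links`, `crawl`, and `fetch_specs` are hypothetical, and the sample markup is invented for illustration. With requests-html, the element returned by request() also exposes an `absolute_links` attribute that serves the same purpose as the first step.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag as an absolute URL, in page order."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            absolute = urljoin(self.base_url, href)
            # Skip duplicates so each racquet page is visited only once.
            if absolute not in self.links:
                self.links.append(absolute)


def collect_links(index_html, base_url):
    """Step 1: gather the URL of every item linked from the index page."""
    collector = _LinkCollector(base_url)
    collector.feed(index_html)
    collector.close()
    return collector.links


def crawl(index_html, base_url, fetch_specs):
    """Step 2: visit each collected URL and return one result per page."""
    return [fetch_specs(link) for link in collect_links(index_html, base_url)]


# Invented index markup with one repeated link.
sample_index = (
    '<ul><li><a href="/racquets/a">A</a></li>'
    '<li><a href="/racquets/b">B</a></li>'
    '<li><a href="/racquets/a">A again</a></li></ul>'
)
```

Passing `fetch_specs` in as a callable keeps the link collection separate from the page parsing, so the real per-page parser can be swapped in later without changing the loop.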

