Data dredging meaning
1/4/2024

There are literally millions of datasets on the internet that are open and freely available to use. Whatever your focus, with so much data at your fingertips, it can be tempting to use these resources to improve your data processing skills, uncover new information, join a data competition, and solve real-world problems. But given the sheer volume of data available, where do you start?

When I was first learning how to work with open data, I would often get lost in the possibility of it all. Not only were there seemingly limitless datasets to choose from, the datasets themselves were often large and unexplored. I would eventually find a dataset that piqued my interest and start “poking” at it. That is, analyzing random questions as they popped into my head, always on the hunt for a result that was interesting.

In many data and science spaces, this type of diving headfirst into analysis without a focused hypothesis driving the exploration is referred to as a “fishing expedition” or “data dredging.” While the phrase often has a negative connotation, when done correctly (i.e. using appropriate statistical methods and a large enough sample size) this type of data exploration can be useful for generating hypotheses or reporting on the “state of affairs” in the domain of your data. But if you hope to use an open dataset to solve a problem or make decisions, you’ll want to start your data project with a clearly defined problem statement instead.

Note: The rest of this article will assume that you are developing a problem statement using found data, i.e. freely available data that you did not collect yourself.
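To make the risk of a “fishing expedition” concrete, here is a minimal sketch of what happens when you ask many unfocused questions of data that contains no real effect. Everything in it is an illustrative assumption rather than something from a real dataset: the function names `dredge` and `count_significant` are hypothetical, the data is simulated noise, each comparison uses a simple two-sample z-test with a known standard deviation of 1, and the Bonferroni correction is just one common example of an “appropriate statistical method.”

```python
import random
from statistics import NormalDist, fmean


def dredge(n_tests=100, n_per_group=50, seed=0):
    """Compare two groups of pure noise n_tests times; return the p-values.

    Both groups are drawn from the same distribution, so any
    "significant" difference is a false positive by construction.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    p_values = []
    for _ in range(n_tests):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        # z statistic for a difference in means, assuming known sd = 1
        z = (fmean(group_a) - fmean(group_b)) / (2 / n_per_group) ** 0.5
        # two-sided p-value
        p_values.append(2 * (1 - std_normal.cdf(abs(z))))
    return p_values


def count_significant(p_values, alpha):
    """Count how many p-values fall below the threshold alpha."""
    return sum(p < alpha for p in p_values)


p_values = dredge()
naive = count_significant(p_values, 0.05)
# Bonferroni correction: divide alpha by the number of questions asked
corrected = count_significant(p_values, 0.05 / len(p_values))

print(f"'Interesting' results at alpha = 0.05: {naive} of {len(p_values)}")
print(f"After Bonferroni correction: {corrected} of {len(p_values)}")
```

With a threshold of 0.05, you should expect roughly 5 out of every 100 comparisons on pure noise to look “interesting” by chance alone. Correcting for the number of questions asked can never increase that count, which is one reason an unfocused hunt for results needs stricter statistics than a single planned test.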