Data Sources provided by UK Libraries: Data Analysis
A guide to help you get started in your pursuit of datasets. UK Libraries provide many data sources across many disciplines.
Data management -- Library Resources (starter list)
Data Management for Researchers by Kristin Briney
A comprehensive guide to everything scientists need to know about data management, this book is essential for researchers who need to learn how to organize, document and take care of their own data. Researchers in all disciplines are faced with the challenge of managing the growing amounts of digital data that are the foundation of their research. Kristin Briney offers practical advice and clearly explains policies and principles, in an accessible and in-depth text that will allow researchers to understand and achieve the goal of better research data management. Data Management for Researchers includes sections on:
* The data problem - an introduction to the growing importance and challenges of using digital data in research. Covers both the inherent problems with managing digital information, as well as how the research landscape is changing to give more value to research datasets and code.
* The data lifecycle - a framework for data's place within the research process and how data's role is changing. Greater emphasis on data sharing and data reuse will not only change the way we conduct research but also how we manage research data.
* Planning for data management - covers the many aspects of data management and how to put them together in a data management plan. This section also includes sample data management plans.
* Documenting your data - an often overlooked part of the data management process, but one that is critical to good management; data without documentation are frequently unusable.
* Organizing your data - explains how to keep your data in order using organizational systems and file naming conventions. This section also covers using a database to organize and analyze content.
* Improving data analysis - covers managing information through the analysis process. This section starts by comparing the management of raw and analyzed data and then describes ways to make analysis easier, such as spreadsheet best practices. It also examines practices for research code, including version control systems.
* Managing secure and private data - many researchers are dealing with data that require extra security. This section outlines what data falls into this category and some of the policies that apply, before addressing the best practices for keeping data secure.
* Short-term storage - deals with the practical matters of storage and backup and covers the many options available. This section also goes through the best practices to ensure that data are not lost.
* Preserving and archiving your data - digital data can have a long life if properly cared for. This section covers managing data in the long term including choosing good file formats and media, as well as determining who will manage the data after the end of the project.
* Sharing/publishing your data - addresses how to make data sharing across research groups easier, as well as how and why to publicly share data. This section covers intellectual property and licenses for datasets, before ending with the altmetrics that measure the impact of publicly shared data.
* Reusing data - as more data are shared, it becomes possible to use outside data in your research. This chapter discusses strategies for finding datasets and lays out how to cite data once you have found it.
This book is designed for active scientific researchers but it is useful for anyone who wants to get more from their data: academics, educators, professionals or anyone who teaches data management, sharing and preservation.
"An excellent practical treatise on the art and practice of data management, this book is essential to any researcher, regardless of subject or discipline." --Robert Buntrock, Chemical Information Bulletin
Call Number: Education Library. Book Stacks Q180.55.E4 B75 2015
Publication Date: 2015
Research Data Management by Joyce M. Ray (Editor)
It has become increasingly accepted that important digital data must be retained and shared in order to preserve and promote knowledge, advance research in and across all disciplines of scholarly endeavor, and maximize the return on investment of public funds. To meet this challenge, colleges and universities are adding data services to existing infrastructures by drawing on the expertise of information professionals who are already involved in the acquisition, management and preservation of data in their daily jobs. Data services include planning and implementing good data management practices, thereby increasing researchers' ability to compete for grant funding and ensuring that data collections with continuing value are preserved for reuse. This volume provides a framework to guide information professionals in academic libraries, presses, and data centers through the process of managing research data from the planning stages through the life of a grant project and beyond. It illustrates principles of good practice with use-case examples and illuminates promising data service models through case studies of innovative, successful projects and collaborations.
Beyond Basic Statistics by Kristin H. Jarman
Features basic statistical concepts as a tool for thinking critically, wading through large quantities of information, and answering practical, everyday questions.
Written in an engaging and inviting manner, Beyond Basic Statistics: Tips, Tricks, and Techniques Every Data Analyst Should Know presents the more subjective side of statistics--the art of data analytics. Each chapter explores a different question using fun, common sense examples that illustrate the concepts, methods, and applications of statistical techniques. Without going into the specifics of theorems, propositions, or formulas, the book effectively demonstrates statistics as a useful problem-solving tool. In addition, the author demonstrates how statistics is a tool for thinking critically, wading through large volumes of information, and answering life's important questions. Beyond Basic Statistics: Tips, Tricks, and Techniques Every Data Analyst Should Know also features:
* Plentiful examples throughout aimed to strengthen readers' understanding of the statistical concepts and methods
* A step-by-step approach to elementary statistical topics such as sampling, hypothesis tests, outlier detection, normality tests, robust statistics, and multiple regression
* A case study in each chapter that illustrates the use of the presented techniques
* Highlights of well-known shortcomings that can lead to false conclusions
* An introduction to advanced techniques such as validation and bootstrapping
Featuring examples that are engaging and non-application specific, the book appeals to a broad audience of students and professionals alike, specifically students of undergraduate statistics, managers, medical professionals, and anyone who has to make decisions based on raw data or compiled results.
A Course in Mathematical Statistics and Large Sample Theory by Rabi Bhattacharya; Lizhen Lin; Victor Patrangenaru
1 Introduction -- 2 Decision Theory -- 3 Introduction to General Methods of Estimation -- 4 Sufficient Statistics, Exponential Families, and Estimation -- 5 Testing Hypotheses -- 6 Consistency and Asymptotic Distributions and Statistics -- 7 Large Sample Theory of Estimation in Parametric Models -- 8 Tests in Parametric and Nonparametric Models -- 9 The Nonparametric Bootstrap -- 10 Nonparametric Curve Estimation -- 11 Edgeworth Expansions and the Bootstrap -- 12 Frechet Means and Nonparametric Inference on Non-Euclidean Geometric Spaces -- 13 Multiple Testing and the False Discovery Rate -- 14 Markov Chain Monte Carlo (MCMC) Simulation and Bayes Theory -- 15 Miscellaneous Topics -- Appendices -- Solutions of Selected Exercises in Part 1.
Call Number: Online access -- ebook
Publication Date: 2016
Linear Regression by David J. Olive
Part I: Python and Statistics -- Why Statistics? -- Python -- Data Input -- Display of Statistical Data -- Part II: Distributions and Hypothesis Tests -- Background -- Distributions of One Variable -- Hypothesis Tests -- Tests of Means of Numerical Data -- Tests on Categorical Data -- Analysis of Survival Times -- Part III: Statistical Modelling -- Linear Regression Models -- Multivariate Data Analysis -- Tests on Discrete Data -- Bayesian Statistics -- Solutions -- Glossary -- Index.
This textbook provides an introduction to the free software Python and its use for statistical data analysis. It covers common statistical tests for continuous, discrete and categorical data, as well as linear regression analysis and topics from survival analysis and Bayesian statistics. Working code and data for Python solutions for each test, together with easy-to-follow Python examples, can be reproduced by the reader and reinforce their immediate understanding of the topic. With recent advances in the Python ecosystem, Python has become a popular language for scientific computing, offering a powerful environment for statistical data analysis and an interesting alternative to R. The book is intended for master and PhD students, mainly from the life and medical sciences, with a basic knowledge of statistics. As it also provides some statistics background, the book can be used by anyone who wants to perform a statistical data analysis.
Call Number: Online access -- ebook
Publication Date: 2017
Logic of Statistical Inference by Ian Hacking; Jan-Willem Romeijn (Preface by)
One of Ian Hacking's earliest publications, this book showcases his early ideas on the central concepts and questions surrounding statistical reasoning. He explores the basic principles of statistical reasoning and tests them, both at a philosophical level and in terms of their practical consequences for statisticians. Presented in a fresh twenty-first-century series livery, and including a specially commissioned preface written by Jan-Willem Romeijn, illuminating its enduring importance and relevance to philosophical enquiry, Hacking's influential and original work has been revived for a new generation of readers.
Mathematical Statistics by Peter J. Bickel; Kjell A. Doksum
Volume I presents fundamental, classical statistical concepts at the doctorate level without using measure theory. It gives careful proofs of major results and explains how the theory sheds light on the properties of practical methods. Volume II covers a number of topics that are important in current measure theory and practice. It emphasizes nonparametric methods which can really only be implemented with modern computing power on large and complex data sets. In addition, the set includes a large number of problems with more difficult ones appearing with hints and partial solutions for the instructor.
Statistics and Analysis of Scientific Data by Massimiliano Bonamente
The revised second edition of this textbook provides the reader with a solid foundation in probability theory and statistics as applied to the physical sciences, engineering and related fields. It covers a broad range of numerical and analytical methods that are essential for the correct analysis of scientific data, including probability theory, distribution functions of statistics, fits to two-dimensional data and parameter estimation, Monte Carlo methods and Markov chains. Features new to this edition include:
* a discussion of statistical techniques employed in business science, such as multiple regression analysis of multivariate datasets.
* a new chapter on the various measures of the mean including logarithmic averages.
* new chapters on systematic errors and intrinsic scatter, and on the fitting of data with bivariate errors.
* a new case study and additional worked examples.
* mathematical derivations and theoretical background material have been appropriately marked, to improve the readability of the text.
* end-of-chapter summary boxes, for easy reference.
As in the first edition, the main pedagogical method is a theory-then-application approach, where emphasis is placed first on a sound understanding of the underlying theory of a topic, which becomes the basis for an efficient and practical application of the material. The level is appropriate for undergraduates and beginning graduate students, and as a reference for the experienced researcher. Basic calculus is used in some of the derivations, and no previous background in probability and statistics is required. The book includes many numerical tables of data, as well as exercises and examples to aid the readers' understanding of the topic.
Statistical software with built-in visualization tools.
Data Mining -- Library Resources (starter list):
Big Data: a Very Short Introduction by Dawn E. Holmes (Contribution by)
Since long before computers were even thought of, data has been collected and organized by diverse cultures across the world. Once access to the Internet became a reality for large swathes of the world's population, the amount of data generated each day became huge, and continues to grow exponentially. It includes all our uploaded documents, video, and photos, all our social media traffic, our online shopping, even the GPS data from our cars. "Big Data" represents a qualitative change, not simply a quantitative one. The term refers both to the new technologies involved, and to the way it can be used by business and government. Dawn E. Holmes uses a variety of case studies to explain how data is stored, analysed, and exploited by a variety of bodies from big companies to organizations concerned with disease control. Big data is transforming the way businesses operate, and the way medical research can be carried out. At the same time, it raises important ethical issues; Holmes discusses cases such as the Snowden affair, data security, and domestic smart devices which can be hijacked by hackers.
ABOUT THE SERIES: The Very Short Introductions series from Oxford University Press contains hundreds of titles in almost every subject area. These pocket-sized books are the perfect way to get ahead in a new subject quickly. Our expert authors combine facts, analysis, perspective, new ideas, and enthusiasm to make interesting and challenging topics highly readable.
Data Mining by Charu C. Aggarwal
This textbook explores the different aspects of data mining from the fundamentals to the complex data types and their applications, capturing the wide diversity of problem domains for data mining issues. It goes beyond the traditional focus on data mining problems to introduce advanced data types such as text, time series, discrete sequences, spatial data, graph data, and social networks. Until now, no single book has addressed all these topics in a comprehensive and integrated way. The chapters of this book fall into one of three categories:
* Fundamental chapters: Data mining has four main problems, which correspond to clustering, classification, association pattern mining, and outlier analysis. These chapters comprehensively discuss a wide variety of methods for these problems.
* Domain chapters: These chapters discuss the specific methods used for different domains of data such as text data, time-series data, sequence data, graph data, and spatial data.
* Application chapters: These chapters study important applications such as stream mining, Web mining, ranking, recommendations, social networks, and privacy preservation. The domain chapters also have an applied flavor.
Appropriate for both introductory and advanced data mining courses, Data Mining: The Textbook balances mathematical details and intuition. It contains the necessary mathematical details for professors and researchers, but it is presented in a simple and intuitive style to improve accessibility for students and industrial practitioners (including those with a limited mathematical background). Numerous illustrations, examples, and exercises are included, with an emphasis on semantically interpretable examples.
Praise for Data Mining: The Textbook -
"As I read through this book, I have already decided to use it in my classes. This is a book written by an outstanding researcher who has made fundamental contributions to data mining, in a way that is both accessible and up to date. The book is complete with theory and practical use cases. It's a must-have for students and professors alike!" -- Qiang Yang, Chair of Computer Science and Engineering at Hong Kong University of Science and Technology
"This is the most amazing and comprehensive text book on data mining. It covers not only the fundamental problems, such as clustering, classification, outliers and frequent patterns, and different data types, including text, time series, sequences, spatial data and graphs, but also various applications, such as recommenders, Web, social network and privacy. It is a great book for graduate students and researchers as well as practitioners." -- Philip S. Yu, UIC Distinguished Professor and Wexler Chair in Information Technology at University of Illinois at Chicago
Call Number: Online access -- ebook
Publication Date: 2015-04-13
Text Mining -- Library Resources (starter list):
Deep text : using text analytics to conquer information overload, get real value from social media, and add big(ger) text to big data by Tom Reamy
Introduction -- Text analytics basics -- Getting started in text analytics -- Text analytics development -- Text analytics applications -- Enterprise text analytics as a platform.
Deep text is an approach to text analytics that adds depth and intelligence to our ability to utilize the growing mass of unstructured text the world is drowning in. Here, author Tom Reamy explains what deep text is and surveys its many uses and benefits. He describes applications and development best practices, discusses business issues including ROI, provides how-to advice and instruction, and offers guidance on selecting software and building a text analytics capability within an organization.
This is an important book for anyone who needs to be on the text analytics cutting edge, from developers and information professionals who create, manage, and curate text-based and Big Data projects to entrepreneurs and business managers looking to cut costs and create new revenue streams. Whether you want to harness a flood of social media content or turn a mountain of business information into an organized and useful asset, Deep Text will supply the insights and examples you'll need to do it effectively.