LEADER 00000cam a2200637Ii 4500 001 903401639 003 OCoLC 005 20240129213017.0 006 m o d 007 cr unu|||||||| 008 150213s2015 maua ob 000 0 eng d 019 930870739 020 9780124173071 020 0124173071 020 0124172954 020 9780124172951 029 1 DEBBG|bBV042487434 029 1 DEBSZ|b434828327 029 1 GBVCP|b88284119X 035 (OCoLC)903401639|z(OCoLC)930870739 037 CL0500000545|bSafari Books Online 040 UMI|beng|erda|epn|cUMI|dDEBBG|dOCLCF|dDEBSZ|dOCLCQ|dSUE|dN9V|dCEF|dOCLCQ|dOCLCO|dOCLCQ|dOCLCO|dOCLCL 049 INap 082 04 005.1 082 04 005.1|223 099 eBook O'Reilly for Public Libraries 100 1 Menzies, Tim,|eauthor. 245 10 Sharing data and models in software engineering /|cTim Menzies, Ekrem Kocaguneli, Leandro Minku, Fayola Peters, Burak Turhan.|h[O'Reilly electronic resource] 250 First edition. 264 1 Waltham, MA :|bMorgan Kaufmann,|c[2015] 264 4 |c©2015 300 1 online resource (1 volume) :|billustrations 336 text|btxt|2rdacontent 337 computer|bc|2rdamedia 338 online resource|bcr|2rdacarrier 504 Includes bibliographical references. 505 0 Front Cover; Sharing Data and Models in Software Engineering; Copyright; Why this book?; Foreword; Contents; List of Figures; Chapter 1: Introduction; 1.1 Why Read This Book?; 1.2 What Do We Mean by "Sharing"?; 1.2.1 Sharing Insights; 1.2.2 Sharing Models; 1.2.3 Sharing Data; 1.2.4 Sharing Analysis Methods; 1.2.5 Types of Sharing; 1.2.6 Challenges with Sharing; 1.2.7 How to Share; 1.3 What? (Our Executive Summary); 1.3.1 An Overview; 1.3.2 More Details; 1.4 How to Read This Book; 1.4.1 Data Analysis Patterns; 1.5 But What About ...? (What Is Not in This Book); 1.5.1 What About "Big Data"? 505 8 1.5.2 What About Related Work?; 1.5.3 Why All the Defect Prediction and Effort Estimation?; 1.6 Who? (About the Authors); 1.7 Who Else?
(Acknowledgments); Part I: Data Mining for Managers; Chapter 2: Rules for Managers; 2.1 The Inductive Engineering Manifesto; 2.2 More Rules; Chapter 3: Rule #1: Talk to the Users; 3.1 Users Biases; 3.2 Data Mining Biases; 3.3 Can We Avoid Bias?; 3.4 Managing Biases; 3.5 Summary; Chapter 4: Rule #2: Know the Domain; 4.1 Cautionary Tale #1: "Discovering" Random Noise; 4.2 Cautionary Tale #2: Jumping at Shadows; 4.3 Cautionary Tale #3: It Pays to Ask. 505 8 4.4 Summary; Chapter 5: Rule #3: Suspect Your Data; 5.1 Controlling Data Collection; 5.2 Problems with Controlled Data Collection; 5.3 Rinse (and Prune) Before Use; 5.3.1 Row Pruning; 5.3.2 Column Pruning; 5.4 On the Value of Pruning; 5.5 Summary; Chapter 6: Rule #4: Data Science Is Cyclic; 6.1 The Knowledge Discovery Cycle; 6.2 Evolving Cyclic Development; 6.2.1 Scouting; 6.2.2 Surveying; 6.2.3 Building; 6.2.4 Effort; 6.3 Summary; Part II: Data Mining: A Technical Tutorial; Chapter 7: Data Mining and SE; 7.1 Some Definitions; 7.2 Some Application Areas; Chapter 8: Defect Prediction. 505 8 8.1 Defect Detection Economics; 8.2 Static Code Defect Prediction; 8.2.1 Easy to Use; 8.2.2 Widely Used; 8.2.3 Useful; Chapter 9: Effort Estimation; 9.1 The Estimation Problem; 9.2 How to Make Estimates; 9.2.1 Expert-Based Estimation; 9.2.2 Model-Based Estimation; 9.2.3 Hybrid Methods; Chapter 10: Data Mining (Under the Hood); 10.1 Data Carving; 10.2 About the Data; 10.3 Cohen Pruning; 10.4 Discretization; 10.4.1 Other Discretization Methods; 10.5 Column Pruning; 10.6 Row Pruning; 10.7 Cluster Pruning; 10.7.1 Advantages of Prototypes; 10.7.2 Advantages of Clustering; 10.8 Contrast Pruning.
505 8 10.9 Goal Pruning; 10.10 Extensions for Continuous Classes; 10.10.1 How RTs Work; 10.10.2 Creating Splits for Categorical Input Features; 10.10.3 Splits on Numeric Input Features; 10.10.4 Termination Condition and Predictions; 10.10.5 Potential Advantages of RTs for Software Effort Estimation; 10.10.6 Predictions for Multiple Numeric Goals; Part III: Sharing Data; Chapter 11: Sharing Data: Challenges and Methods; 11.1 Houston, We Have a Problem; 11.2 Good News, Everyone; Chapter 12: Learning Contexts; 12.1 Background; 12.2 Manual Methods for Contextualization; 12.3 Automatic Methods. 520 Data Science for Software Engineering: Sharing Data and Models presents guidance and procedures for reusing data and models between projects to produce results that are useful and relevant. Starting with a background section of practical lessons and warnings for beginner data scientists for software engineering, this edited volume proceeds to identify critical questions of contemporary software engineering related to data and models. Learn how to adapt data from other organizations to local problems, mine privatized data, prune spurious information, simplify complex results, how to update models for new platforms, and more. Chapters share largely applicable experimental results discussed with the blend of practitioner focused domain expertise, with commentary that highlights the methods that are most useful, and applicable to the widest range of projects. Each chapter is written by a prominent expert and offers a state-of-the-art solution to an identified problem facing data scientists in software engineering. Throughout, the editors share best practices collected from their experience training software engineering students and practitioners to master data science, and highlight the methods that are most useful, and applicable to the widest range of projects. 588 0 Online resource; title from title page (Safari, viewed January 28, 2015).
590 O'Reilly|bO'Reilly Online Learning: Academic/Public Library Edition 650 0 Software engineering. 650 0 Data structures (Computer science) 650 6 Génie logiciel. 650 6 Structures de données (Informatique) 650 7 Data structures (Computer science)|2fast 650 7 Software engineering|2fast 700 1 Kocaguneli, Ekrem,|eauthor. 700 1 Turhan, Burak,|eauthor. 700 1 Minku, Leandro,|eauthor. 700 1 Peters, Fayola,|eauthor. 856 40 |uhttps://ezproxy.naperville-lib.org/login?url=https://learning.oreilly.com/library/view/~/9780124172951/?ar|zAvailable on O'Reilly for Public Libraries 994 92|bJFN