Data Preparation for Data Mining Using SAS
Elsevier, July 27, 2010 - 424 pages
Are you a data mining analyst who spends up to 80% of your time assuring data quality and then preparing that data for developing and deploying predictive models? Do you find plenty of literature on data mining theory and concepts, but little "how-to" information when it comes to practical advice on developing good mining views? And are you, like most analysts, preparing the data in SAS?
This book is intended to fill this gap as your source of practical recipes. It introduces a framework for the process of data preparation for data mining and presents the detailed implementation of each step in SAS. In addition, business applications of data mining modeling require you to deal with a large number of variables, typically hundreds if not thousands. The book therefore devotes several chapters to methods of data transformation and variable selection.
Contents
A QUICK START | 29 |
CHAPTER 5 DATA ACQUISITION AND INTEGRATION | 43 |
CHAPTER 6 INTEGRITY CHECKS | 63 |
CHAPTER 7 EXPLORATORY DATA ANALYSIS | 83 |
CHAPTER 8 SAMPLING AND PARTITIONING | 99 |
CHAPTER 9 DATA TRANSFORMATIONS | 115 |
CHAPTER 10 BINNING AND REDUCTION OF CARDINALITY | 141 |
CHAPTER 11 TREATMENT OF MISSING VALUES | 171 |
CHAPTER 12 PREDICTIVE POWER AND VARIABLE REDUCTION I | 207 |
CHAPTER 13 ANALYSIS OF NOMINAL AND ORDINAL VARIABLES | 211 |
CHAPTER 14 ANALYSIS OF CONTINUOUS VARIABLES | 233 |
CHAPTER 15 PRINCIPAL COMPONENT ANALYSIS | 247 |
CHAPTER 16 FACTOR ANALYSIS | 257 |
CHAPTER 17 PREDICTIVE POWER AND VARIABLE REDUCTION II | 267 |
CHAPTER 18 PUTTING IT ALL TOGETHER | 279 |
APPENDIX LISTING OF SAS MACROS | 297 |
About the Author | 393 |
Common terms and phrases
&IDVar &Nvars &var &WarX &Xvar algorithms binary binning calculate call symput categorical variables cluster contingency table continuous variables create table credit card CustID Data &DSout data mining data null_ data preparation data warehouse Database decision tree defined delete dependent variable Description DSin Input distribution DSin Input dataset DSout eigenvalues Equation example extract factor analysis finish the macro frequencies function Gini ratio Header imputed values likelihood function linear regression logistic regression loop macro variables mapping mend merge methods mining view missing pattern missing values nolist nominal variables number of records ordinal variables outliers output dataset p-value Parameter Description DSin Parameters of macro partitions population principal component proc datasets library=work proc sort proc sql noprint PROC UNIVARIATE procedures quit regression model rollup sample scoring view select count Set &DSin split temp_freqs Temp_TERM transaction transformation variable names variance VarList Xvar