Pattern Recognition and Machine Learning (Information Science and Statistics)
by Christopher M. Bishop
from Springer
The dramatic growth in practical applications for machine learning over the last ten years has been accompanied by many important developments in the underlying algorithms and techniques. For example, Bayesian methods have grown from a specialist niche to become mainstream, while graphical models have emerged as a general framework for describing and applying probabilistic techniques. The practical applicability of Bayesian methods has been greatly enhanced by the development of a range of approximate inference algorithms such as variational Bayes and expectation propagation, while new models based on kernels have had a significant impact on both algorithms and applications.
This completely new textbook reflects these recent developments while providing a comprehensive introduction to the fields of pattern recognition and machine learning. It is aimed at advanced undergraduates or first-year PhD students, as well as researchers and practitioners. No previous knowledge of pattern recognition or machine learning concepts is assumed. Familiarity with multivariate calculus and basic linear algebra is required, and some experience in the use of probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
The book is suitable for courses on machine learning, statistics, computer science, signal processing, computer vision, data mining, and bioinformatics. Extensive support is provided for course instructors, including more than 400 exercises, graded according to difficulty. Example solutions for a subset of the exercises are available from the book web site, while solutions for the remainder can be obtained by instructors from the publisher. The book is supported by a great deal of additional material, and the reader is encouraged to visit the book web site for the latest information.
Coming soon:
*For students, worked solutions to a subset of exercises available on a public web site (for exercises marked "www" in the text)
*For instructors, worked solutions to remaining exercises from the Springer web site
*Lecture slides to accompany each chapter
*Data sets available for download
Programming Collective Intelligence: Building Smart Web 2.0 Applications
by Toby Segaran
from O'Reilly Media, Inc.
Want to tap the power behind search rankings, product recommendations, social bookmarking, and online matchmaking? This fascinating book demonstrates how you can build Web 2.0 applications to mine the enormous amount of data created by people on the Internet. With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it. Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains: Collaborative filtering techniques that enable online retailers to recommend products or media Methods of clustering to detect groups of similar items in a large dataset Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm Optimization algorithms that search millions of possible solutions to a problem and choose the best one Bayesian filtering, used in spam filters for classifying documents based on word types and other features Using decision trees not only to make predictions, but to model the way decisions are made Predicting numerical values rather than classifications to build price models Support vector machines to match people in online dating sites Non-negative matrix factorization to find the independent features in adataset Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game Each chapter includes exercises for extending the algorithms to make them more powerful. Go beyond simple database-backed applications and put the wealth of Internet data to work for you. "Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details." -- Dan Russell, Google "Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths." -- Tim Wolters, CTO, Collective Intellect
Speech and Language Processing (2nd Edition) (Prentice Hall Series in Artificial Intelligence)
by Daniel Jurafsky
from Prentice Hall
An explosion of Web-based language techniques, merging of distinct fields, availability of phone-based dialogue systems, and much more make this an exciting time in speech and language processing. The first of its kind to thoroughly cover language technology – at all levels and with all modern technologies – this book takes an empirical approach to the subject, based on applying statistical and other machine-learning algorithms to large corporations. Builds each chapter around one or more worked examples demonstrating the main idea of the chapter, usingthe examples to illustrate the relative strengths and weaknesses of various approaches. Adds coverage of statistical sequence labeling, information extraction, question answering and summarization, advanced topics in speech recognition, speech synthesis. Revises coverage of language modeling, formal grammars, statistical parsing, machine translation, and dialog processing. A useful reference for professionals in any of the areas of speech and language processing.
Foundations of Statistical Natural Language Processing
by Christopher D. Manning
from The MIT Press
Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.
Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications)
by Bing Liu
from Springer
Web mining aims to discover useful information and knowledge from the Web hyperlink structure, page contents, and usage data. Although Web mining uses many conventional data mining techniques, it is not purely an application of traditional data mining due to the semistructured and unstructured nature of the Web data and its heterogeneity. It has also developed many of its own algorithms and techniques.
Liu has written a comprehensive text on Web data mining. Key topics of structure mining, content mining, and usage mining are covered both in breadth and in depth. His book brings together all the essential concepts and algorithms from related areas such as data mining, machine learning, and text processing to form an authoritative and coherent text.
The book offers a rich blend of theory and practice, addressing seminal research ideas, as well as examining the technology from a practical point of view. It is suitable for students, researchers and practitioners interested in Web mining both as a learning text and a reference book. Lecturers can readily use it for classes on data mining, Web mining, and Web search. Additional teaching materials such as lecture slides, datasets, and implemented algorithms are available online.
Real Sound Synthesis for Interactive Applications (Book & CD-ROM)
by Perry R. Cook
from AK Peters, Ltd.
...emphasizes physical modeling of sound and focuses on real-world interactive sound effects...intended for game developers, graphics programmers, developers of virtual reality systems and training simulators.
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition (Prentice Hall Series in Artificial Intelligence)
by Daniel Jurafsky
from Prentice Hall
This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora.Methodology boxes are included in each chapter. Each chapter is built around one or more worked examples to demonstrate the main idea of the chapter. Covers the fundamental algorithms of various fields, whether originally proposed for spoken or written language to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation. Emphasis on web and other practical applications. Emphasis on scientific evaluation. Useful as a reference for professionals in any of the areas of speech and language processing.
Text Mining Application Programming (Programming Series)
by Manu Konchady
from Charles River Media
Text Mining Application Programming teaches software developers how to mine the vast amounts of information available on the Web, internal networks, and desktop files and turn it into usable data. The book helps developers understand the problems associated with managing unstructured text, and explains how to build your own mining tools using standard statistical methods from information theory, artificial intelligence, and operations research. Each of the topics covered are thoroughly explained and then a practical implementation is provided. The book begins with a brief overview of text data, where it can be found, and the typical search engines and tools used to search and gather this text. It details how to build tools for extracting and using the text, and covers the mathematics behind many of the algorithms used in building these tools. From there you'll learn how to build tokens from text, construct indexes, and detect patterns in text. You'll also find methods to extract the names of people, places, and organizations from an email, a news article, or a Web page. The next portion of the book teaches you how to find information on the Web, the structure of the Web, and how to build spiders to crawl the Web. Text categorization is also described in the context of managing email. The final part of the book covers information monitoring, summarization, and a simple Question & Answer (Q&A) system. The code used in the book is written in Perl, but knowledge of Perl is not necessary to run the software. Developers with an intermediate level of experience with Perl can customize the software. Although the book is about programming, methods are explained with English-like pseudocode and the source code is provided on the CD-ROM. After reading this book, you'll be ready to tap into the bevy of information available online in ways you never thought possible.
Spoken Language Processing: A Guide to Theory, Algorithm and System Development
by Xuedong Huang
from Prentice Hall PTR
Using MPI - 2nd Edition: Portable Parallel Programming with the Message Passing Interface (Scientific and Engineering Computation)
by William Gropp
from The MIT Press
The Message Passing Interface (MPI) specification is widely used for solving significant scientific and engineering problems on parallel computers. There exist more than a dozen implementations on computer platforms ranging from IBM SP-2 supercomputers to clusters of PCs running Windows NT or Linux ("Beowulf" machines). The initial MPI Standard document, MPI-1, was recently updated by the MPI Forum. The new version, MPI-2, contains both significant enhancements to the existing MPI core and new features.
Using MPI is a completely up-to-date version of the authors' 1994 introduction to the core functions of MPI. It adds material on the new C++ and Fortran 90 bindings for MPI throughout the book. It contains greater discussion of datatype extents, the most frequently misunderstood feature of MPI-1, as well as material on the new extensions to basic MPI functionality added by the MPI-2 Forum in the area of MPI datatypes and collective operations.
Using MPI-2 covers the new extensions to basic MPI. These include parallel I/O, remote memory access operations, and dynamic process management. The volume also includes material on tuning MPI applications for high performance on modern MPI implementations.
+++


