Machine Learning Introduction

1. Machine Learning: Introduction Book reading: 2014 summer Jinseob Kim GSPH, SNU October 18, 2014 Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 1 / 55

2. What is Machine Learning? ôè0 YµXì !` ˆÄ] !¨(prediction)D X” xõÀ¥X „|. Computer science + Statistics ?? Amazon, Google, Facebook.. Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 2 / 55

3. Œ IT0Å `0ÄYµ' Ñ http://www.dt.co.kr/contents. html?article_no=2014062002010960718002 8Ä” À xõÀ¥ ô 6pìì è$X m@ `]' http://vip.mk.co.kr/news/view/21/20/1178659.html MS t|°Ü, `8àìÝ' tÄä http://www.bloter.net/archives/196341 $t” 5 ü” 0 ü `%ìÝ' http://www.wikitree.co.kr/main/news_view.php?id=157174 xõÀ¥ Ü lX èt¼ ¸ http://weekly.chosun. com/client/news/viw.asp?nNewsNumb=002311100009ctcd=C02 Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 3 / 55

4. Overview Contents 1 Overview Interpretation vs Prediction Types of Machine Learning Techniques 2 Book Reading Plan Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 4 / 55

5. Overview Interpretation vs Prediction Objective of statistics 1 ÀÝX U¥, Causal inference µÄY Pearson: äX ÄT` …D Xì.. 2 X¬° µÄY R.A Fisher: ¥ 1¥t ‹@ DÌ Ý Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 5 / 55

6. Overview Interpretation vs Prediction Statistics in Epidemiology Causal inference: Ðxt 4Çx? tt ˜” ¨t ñtä. xüÄ ”`. è ¨ 8. Å½ÀX èÄ ”(Kilometer VS meter, centering issue)

7. , Odds Ratio(OR), Hazard Ratio(HR), p-value, AIC Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 6 / 55

8. Overview Interpretation vs Prediction Statistics in Machine Learning Prediction: ^ ´»Œ ƒx? !%t ‹@ ƒt ñtä. õ¡ ¨Ä ÁÆä. !Ì ¨( ˜ ät. D”Ð 0| Å½ÀäD ¬ ¼ä. (Scale change) ^ Y , ^p, Cross-validation, Accuracy, ROC curve Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 7 / 55

9. Overview Interpretation vs Prediction Example: Logistic regression Binomial data| äè” % µÄ„)•. ¹ˆ epidemiologic studyÐ” x À.

10. ! Odds Ratio(OR) : tt }ä. But.. Logit function... Ä°t ´$ÌÀ” Ðx. Heritability issue of binomial trait?? Logith ”x.. Probit modelt Ht ˆä. Ä°}ä.

11. t ´5ä.. Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 8 / 55

12. Overview Interpretation vs Prediction Logit VS Probit Figure. Logit VS Probit Logit: Pr(Y = 1 j X) = [1 + eX0

13. ]1 Probit: Pr(Y = 1 j X) = (X0

14. ) Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 9 / 55

15. Overview Interpretation vs Prediction Example: Polygenic model(GenABEL) Figure. Time comparison: Probit vs Logit ø¬ missing˜¬ Ä GenABELpXÀ Jà hglm package Xt, kinship matrix decomposition(Singular Value Decomposition) üD ˆÌ ` ˆ´ Ü T è• ¥ Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 10 / 55

16. Overview Interpretation vs Prediction Figure. Fixed dispersion parameter(= 1) Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 11 / 55

17. Overview Interpretation vs Prediction Example2: Cox proportional hazard model Censored data„X . http: //www.theriac.org/DeskReference/viewDocument.php?id=188 Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 12 / 55

18. Overview Interpretation vs Prediction http://www.uni-kiel.de/psychologie/rexrepos/posts/ survivalCoxPH.html Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 13 / 55

19. Overview Interpretation vs Prediction Assumptions ln (t) = ln 0(t) +

20. 1X1 + +

21. pXp = ln 0(t) + X

22. (t) = 0(t) e

23. 1X1++

24. pXp = 0(t) eX

25. S(t) = S0(t)exp(X

26. ) = exp 0(t) eX

27. (t) = 0(t) eX

28. (t) 0(t) = eX

30. : Hazard Ratio(HR) Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 14 / 55

31. Overview Interpretation vs Prediction Hazard Ratio t ¸Xä. Odd Ratio . But, t Ît ä´ä. Ýt õ¡t Ä°t ´5ä. Conditional Logistic Regression.. PredictionÐÄ Cox| àÑ` D”” Æä. Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 15 / 55

32. Overview Interpretation vs Prediction Alternatives Yi : Time of event Not censored p(yi ji ; 2) = (22)1 2 expf (yi i )2 22 g Censored p(yi ti ji ; 2) = Z 1 ti (22)1 2 expf (yi i )2 22 g@yi = ( i ti ) Ü„ìX CDF èˆ ! Ä°t }ä!! Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 16 / 55

33. Overview Interpretation vs Prediction Example3: Correlation Structure Example: Pedigree structure à$t|X˜? 1 Genome-Wide Association Study(GWAS): Important

34. X s.e ä. ! p-value ä. 2 Prediction model: Not important

35. ´” lŒ Hä.! ^ Y ; ^p” ˜ Hä. Pedigree : Unmeasured polygenic eect ! !À J@ ƒ@ New dataÐ prediction` L t©` Æä. Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 17 / 55

36. Overview Interpretation vs Prediction Our data SNP Chromosome Position A1 A2 N Beta SE P Beta FASTA SE FASTA P FASTA rs2801233 21 13525448 T C 1799 -2.78 2.45 0.258 -3.05 2.62 0.244 rs2801294 21 13557024 C G 1830 -2.12 2.78 0.447 -1.94 2.95 0.510 rs2260895 21 13564335 C T 1815 -3.04 2.77 0.273 -2.79 2.94 0.343 rs2821796 21 13571669 A C 1833 -6.13 2.45 0.012 -6.29 2.59 0.015 rs2742182 21 13587844 T C 1819 -2.29 2.77 0.407 -2.18 2.93 0.458 rs2259207 21 13598778 T C 1804 -3.35 3.03 0.269 -4.45 3.17 0.160 rs2259403 21 13615252 G A 1818 -6.07 2.48 0.014 -6.08 2.60 0.020 rs2821847 21 13689440 A G 1817 -2.10 2.87 0.463 -2.10 2.98 0.482 rs2821849 21 13691411 T C 1816 -1.74 2.72 0.522 -1.13 2.82 0.688 rs2747265 21 13696956 C G 1819 -18.96 10.75 0.078 -14.97 11.27 0.184 Table. No pedigree VS pedigree : TG-GWAS Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 18 / 55

37. Overview Interpretation vs Prediction Figure. A representation of the tradeo between exibility and interpretability, using dierent statistical learning methods. In general, as the exibility of a method increases, its interpretability decreases[3] Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 19 / 55

38. Overview Interpretation vs Prediction Catching crumbs from the table In the face of metahuman science, humans have become metascientists. Ted Chiang It has been 25 years since a report of origi-nal research was last submitted to our editors for publication, making this an appropriate time to revisit the question that was so widely debated then: what is the role of human scientists in an age when the frontiers of scientific inquiry have moved beyond the comprehensibility of humans? No doubt many of our subscribers remember reading papers whose authors were the first individuals ever to obtain the results they described. But as metahumans began to dominate experimental research, they increasingly made their findings avail-able only via DNT (digital neural transfer), leaving journals to publish second-hand accounts translated into human language. Without DNT, humans could not fully grasp earlier developments nor effectively utilize the new tools needed to conduct research, while metahumans continued to improve DNT and rely on it even more. Jour-nals for human audiences were reduced to vehicles of popularization, and poor ones at that, as even the most brilliant humans found themselves puzzled by translations of the latest findings. No one denies the many benefits of metahuman science, but one of its costs to human researchers was the realization that they would probably never make an original contribution to science again. Some left the field altogether, but those who stayed shifted their attentions away from original research and toward hermeneutics: interpreting the scientific work of metahumans. Textual hermeneutics became popular first, since there were already terabytes of metahuman publications whose transla-tions, although cryptic, were presumably not entirely inaccurate. Deciphering these texts bears little resemblance to the task per-formed by traditional palaeographers, but progress continues: recent experiments have validated the Humphries decipherment of decade-old publications on histocompati-bility genetics. The availability of devices based on metahuman science gave rise to artefact hermeneutics. Scientists began attempting to ‘reverse engineer’ these artefacts, their goal being not to manufacture competing products, but simply to understand the physical principles underlying their opera-tion. The most common technique is the crystallographic analysis of nanoware appli-entific futures inquiry and increases the body of human knowledge just as original research did. Moreover, human researchers may discern applications overlooked by meta-humans, whose advantages tend to make them unaware of our concerns. For example, imagine if research offered hope of a different intelligence-enhancing therapy, one that would allow individuals to gradually ‘upgrade’ their minds to a level equivalent to that of a metahuman. Such a therapy would offer a bridge across what has become the greatest cultural divide in our species’ history, yet it might not even occur to metahumans to explore it; that possibility alone justifies the continuation of human research. We need not be intimidated by the accomplishments of metahuman science. We should always remember that the tech-nologies that made metahumans possible were originally invented by humans, and they were no smarter than we. n Ted Chiang is an occasional writer of science fiction. His latest story can be found in the anthology Vanishing Acts, published by Tor Books. ances, which frequently provides us with new insights into mechanosynthesis. The newest and by far the most speculative mode of inquiry is remote sensing of metahuman research facilities. A recent target of investigation is the ExaCollider recently installed beneath the Gobi Desert, whose puzzling neutrino signature has been the subject of much controversy. (The portable neutrino detector is, of course, another metahuman arte-fact whose oper-ating principles remain elusive.) The question is, are these worthwhile undertakings for sci-entists? Some call them a waste of time, likening them to a Native American research effort into bronze smelting when steel tools of European manufacture are readily available. This comparison might be more apt if humans were in competition with metahumans, but in today’s economy of abundance there is no evidence of such competition. In fact, it is important to recognize that — unlike most previous low-technology cultures confronted with a high-technology one — humans are in no danger of assimilation or extinction. There is still no way to augment a human brain into a metahuman one; the Sugimoto gene therapy must be performed before the embryo begins neurogenesis in order for a brain to be compatible with DNT. This lack of an assimilation mechanism means that human parents of a metahuman child face a difficult choice: to allow their child DNT interaction with metahuman culture, and watch him or her grow incomprehensible to them; or else restrict access to DNT during the child’s formative years, which to a metahuman is deprivation like that suffered by Kaspar Hauser. It is not surprising that the percentage of human parents choosing the Sugimoto gene therapy for their children has dropped almost to zero in recent years. As a result, human culture is likely to sur-vive well into the future, and the scientific tradition is a vital part of that culture. Hermeneutics is a legitimate method of sci- NATURE|VOL 405 | 1 JUNE 2000 |www.nature.com 517 JACEY © 2000 Macmillan Magazines Ltd Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 20 / 55

39. Overview Interpretation vs Prediction Human VS metahuman[1] Ted Chiang : SF Œ$ TÀ xX(xõÀ¥)X UÄx ÀÝ˜¬¥%. Human science: TÀ xX ¸ ƒäD tX” ÄX . TÀ xXX |8D ˆíX” ƒt human science.. Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 21 / 55

40. Overview Types of Machine Learning 0ÄYµX …X Supervised learning Labeled data Regression, classi

41. cation... Unsupervised learning Unlabeled data Semi-supervised learning Labeled + Unlabeled data (ex: censored data, missing data) Reinforcement learning Reward Etc: : : Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 22 / 55

42. Overview Types of Machine Learning http: //www.astroml.org/sklearn_tutorial/general_concepts.html Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 23 / 55

43. Overview Types of Machine Learning Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 24 / 55

44. Overview Types of Machine Learning Shah A R et al. Bioinformatics 2008;24:783-790[8] Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 25 / 55

45. Overview Types of Machine Learning http://www.cns.atr.jp/cnb/crp/ Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 26 / 55

46. Overview Types of Machine Learning http://www2.hawaii.edu/~chenx/ics699rl/grid/rl.html Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 27 / 55

47. Overview Techniques Techniques k-Nearest Neighbors(kNN) Neural Network K-Means Clustering Principal Component Analysis Tree(Bagging, Boosting, Ensemble) Support Vector Machine Naive Bayes Etc.. Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 28 / 55

48. Overview Techniques k-Nearest Neighbors(kNN) useR 2014 tutorial: Applied Predictive Modelling http://appliedpredictivemodeling.com/s/Applied_Predictive_ Modeling_in_R.pdf Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 29 / 55

49. Overview Techniques Neural Network Human brain VS Computer 3431 3324 =?? @ à‘t lÄ, L1xÝ, 8xÝ Sequential VS Parallel Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 30 / 55

50. Overview Techniques Neuron Arti

51. cial Neural Network(ANN)[7] Figure. (A) Human neuron; (B) arti

52. cial neuron or hidden unity; (C) biological synapse; (D) ANN synapses. Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 31 / 55

53. Overview Techniques http://www.nd.com/welcome/whatisnn.htm Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 32 / 55

54. Overview Techniques Deep Neural Network(DNN) ' Deep Learning Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 33 / 55

55. Overview Techniques Examples: Cat recognition 16,000X CPU ø¼Ì ôà à‘t xÝ (Unsupervised Learning) GPU| t©Xì Computing Ü „. http: //www.asiae.co.kr/news/view.htm?idxno=2012062708351993171 http://googleblog.blogspot.kr/2012/06/ using-large-scale-brain-simulations-for.html Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 34 / 55

56. Overview Techniques Paper[5, 2] Building High-level Features Using Large Scale Unsupervised Learning Quoc V. Le quocle@cs.stanford.edu Marc’Aurelio Ranzato ranzato@google.com Rajat Monga rajatmonga@google.com Matthieu Devin mdevin@google.com Kai Chen kaichen@google.com Greg S. Corrado gcorrado@google.com Jeff Dean jeff@google.com Andrew Y. Ng ang@cs.stanford.edu Abstract We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it pos-sible to learn a face detector using only unla-beled images? To answer this, we train a 9- layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 mil-lion 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bod-ies. Starting with these learned features, we trained our network to obtain 15.8% accu-racy in recognizing 22,000 object categories from ImageNet, a leap of 70% relative im-provement over the previous state-of-the-art. Appearing in Proceedings of the 29 th International Confer-ence on Machine Learning, Edinburgh, Scotland, UK, 2012. Copyright 2012 by the author(s)/owner(s). 1. Introduction The focus of this work is to build high-level, class-specific feature detectors from unlabeled images. For instance, we would like to understand if it is possible to build a face detector from only unlabeled images. This approach is inspired by the neuroscientific conjecture that there exist highly class-specific neurons in the hu-man brain, generally and informally known as “grand-mother neurons.” The extent of class-specificity of neurons in the brain is an area of active investigation, but current experimental evidence suggests the possi-bility that some neurons in the temporal cortex are highly selective for object categories such as faces or hands (Desimone et al., 1984), and perhaps even spe-cific people (Quiroga et al., 2005). Contemporary computer vision methodology typically emphasizes the role of labeled data to obtain these class-specific feature detectors. For example, to build a face detector, one needs a large collection of images labeled as containing faces, often with a bounding box around the face. The need for large labeled sets poses a significant challenge for problems where labeled data are rare. Although approaches that make use of inex-pensive unlabeled data are often preferred, they have not been shown to work well for building high-level features. This work investigates the feasibility of building high-level features from only unlabeled data. A positive answer to this question will give rise to two significant results. Practically, this provides an inexpensive way to develop features from unlabeled data. But perhaps more importantly, it answers an intriguing question as to whether the specificity of the “grandmother neuron” could possibly be learned from unlabeled data. Infor-mally, this would suggest that it is at least in principle possible that a baby learns to group faces into one class Deep learning with COTS HPC systems Adam Coates acoates@cs.stanford.edu Brody Huval brodyh@stanford.edu Tao Wang twangcat@stanford.edu David J. Wu dwu4@cs.stanford.edu Andrew Y. Ng ang@cs.stanford.edu Stanford University Computer Science Dept., 353 Serra Mall, Stanford, CA 94305 USA Bryan Catanzaro bcatanzaro@nvidia.com NVIDIA Corporation, 2701 San Tomas Expressway, Santa Clara, CA 95050 Abstract Scaling up deep learning algorithms has been shown to lead to increased performance in benchmark tasks and to enable discovery of complex high-level features. Recent efforts to train extremely large networks (with over 1 billion parameters) have relied on cloud-like computing infrastructure and thousands of CPU cores. In this paper, we present tech-nical details and results from our own sys-tem based on Commodity Off-The-Shelf High Performance Computing (COTS HPC) tech-nology: a cluster of GPU servers with Infini-band interconnects and MPI. Our system is able to train 1 billion parameter networks on just 3 machines in a couple of days, and we show that it can scale to networks with over 11 billion parameters using just 16 machines. As this infrastructure is much more easily marshaled by others, the approach enables much wider-spread research with extremely large neural networks. 1. Introduction A significant amount of effort has been put into de-veloping deep learning systems that can scale to very large models and large training sets. With each leap in scale new results proliferate: large models in the literature are now top performers in supervised vi-sual recognition tasks (Krizhevsky et al., 2012; Cire-san et al., 2012; Le et al., 2012), and can even learn Proceedings of the 30 th International Conference on Ma-chine Learning, Atlanta, Georgia, USA, 2013. JMLR: WCP volume 28. Copyright 2013 by the author(s). to detect objects when trained from unlabeled im-ages alone (Coates et al., 2012; Le et al., 2012). The very largest of these systems has been constructed by Le et al. (Le et al., 2012) and Dean et al. (Dean et al., 2012), which is able to train neural networks with over 1 billion trainable parameters. While such extremely large networks are potentially valuable objects of AI research, the expense to train them is overwhelming: the distributed computing infrastructure (known as “DistBelief”) used for the experiments in (Le et al., 2012) manages to train a neural network using 16000 CPU cores (in 1000 machines) in just a few days, yet this level of resource is likely beyond those available to most deep learning researchers. Less clear still is how to continue scaling significantly beyond this size of network. In this paper we present an alternative approach to training such networks that leverages in-expensive computing power in the form of GPUs and introduces the use of high-speed communications in-frastructure to tightly coordinate distributed gradient computations. Our system trains neural networks at scales comparable to DistBelief with just 3 machines. We demonstrate the ability to train a network with more than 11 billion parameters—6.5 times larger than the model in (Dean et al., 2012)—in only a few days with 2% as many machines. Buoyed by many empirical successes (Uetz Behnke, 2009; Raina et al., 2009; Ciresan et al., 2012; Krizhevsky, 2010; Coates et al., 2011) much deep learning research has focused on the goal of building larger models with more parameters. Though some techniques (such as locally connected networks (Le- Cun et al., 1989; Raina et al., 2009; Krizhevsky, 2010), and improved optimizers (Martens, 2010; Le et al., 2011)) have enabled scaling by algorithmic advan-tage, another main approach has been to achieve scale Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 35 / 55

57. Overview Techniques K-Means Clustering http://shiny.rstudio.com/gallery/kmeans-example.html http://mines.humanoriented.com/classes/2010/fall/csci568/ portfolio_exports/mvoget/cluster/cluster.html Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 36 / 55

58. Overview Techniques Principal Component Analysis(PCA) Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 37 / 55

59. Overview Techniques PCA Application: Population strati

60. cation[6] Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 38 / 55

61. Overview Techniques Tree http://www.reocities.com/cbursk/machine_learning.html Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 39 / 55

62. Overview Techniques useR 2014 tutorial: Applied Predictive Modelling Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 40 / 55

63. Overview Techniques Bagging, Random forest, Boosting, Ensemble.. øôï Tree ô ä iä. http://www.iis.ee.ic.ac.uk/icvl/iccv09_tutorial.html Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 41 / 55

64. Overview Techniques Random forest http://www.iis.ee.ic.ac.uk/icvl/iccv09_tutorial.html Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 42 / 55

65. Overview Techniques Support Vector Machine(SVM) Figure. Example: SVM useR 2014 tutorial: Applied Predictive Modelling Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 43 / 55

66. Overview Techniques Naive Bayes Naive(ô) + Bayes(pt€U`) http://www.saedsayad.com/naive_bayesian.htm Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 44 / 55

67. Overview Techniques Example: Spam mail (1) `penis' è´ ˆD L spam mail| U`@? Assumptions 1 74 T| 30” ¤8T| ! P(spam) = 30 74 2 74 T| 51” `penis' ìh ! P(penis) = 51 74 3 30 ¤8T| 20 'penis' ìh ! P(penis j spam) = 20 30 Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 45 / 55

68. Overview Techniques Figure. Example: Naive Bayes (1) https://www.bionicspirit.com/blog/2012/02/09/ howto-build-naive-bayes-classifier.html Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 46 / 55

69. Overview Techniques Example: Spam mail (2) `penis' `viagra' è´ ˆD L spam mail| U`@? Assumptions 1 74 T| 30” ¤8T| ! P(spam) = 30 74 2 74 T| 51” `penis' ìh ! P(penis) = 51 74 3 30 ¤8T| 20 'penis' ìh ! P(penis j spam) = 20 30 4 74 T| 25” `viagra' ìh ! P(viagra) = 25 74 5 30 ¤8T| 24 `viagra' ìh ! P(viagra j spam) = 24 30 6 `penis'@ `viagra' è´X œ@ Å½. ! P(penis viagra) = P(penis) P(viagra) : Naive Bayes Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 47 / 55

70. Overview Techniques Figure. Example: Naive Bayes (2) Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 48 / 55

71. Book Reading Plan Contents 1 Overview Interpretation vs Prediction Types of Machine Learning Techniques 2 Book Reading Plan Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 49 / 55

72. Book Reading Plan Objective 1 ü ‰X” 0ÄYµX PD Là, íYðlÐ ¬©X” µÄ@X (tD ttä. 2 !¨X ììÀ LlÉü É)•D ttä. (Pttì, Not Ý) 3 RD t©Xì ä !¨D Xà Éä. 4 °¬ ðlät à ˆ” ) pt0( ´ôìh)D t©Xì ÈÑ!¨D Xà UÄ| É. Supervised Machine LearningD .. Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 50 / 55

73. Book Reading Plan Book 1 An introduction to statistical learning http://www-bcf.usc.edu/~gareth/ISL : ISL[3] 2 Applied Predictive Modelling http://appliedpredictivemodeling.com : APM[4] Coursera Machine Learning https://class.coursera.org/ml-006 Practical Machine Learning https://class.coursera.org/predmachlearn-003 Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 51 / 55

74. Book Reading Plan Plan 1 7/16 : Introduction (x ¬ð, t VS !) - @Ä- 2 7/23 : Data pre-processing APM 3 3 8/6 : Over

75. tting Model tuning, sampling method ISL chap 5 APM chap 4 4 8/13 : Linear Model Selection and Regularization Ridge, Lasso ñ.. penalized regression ì. ISL chap 6(6.2) APM 6.4, 12.5 t8| Ì ñ 5 8/20 Tree (ISL chap 8 APM 8, 14) Support vector machine (ISL chap 9 APM 7.3, 13.4) 6 8/27 : Unsupervised learning (ISL chap 10) PCA Clustering Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 52 / 55

76. Book Reading Plan Recommendation Presenter P EX ´© µiXì.. R äµ caret package ”œ Œ EÐ ˆ” (P E ¨P ˆL, H˜tÀ 8p) ´ ?? Training data ! Accuracy ! Testing data ! Accurary LÀ.. Listener 2 books R R studio http://www.rstudio.com/ Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 53 / 55

77. Book Reading Plan Reference I [1] Chiang, T. (2000). Catching crumbs from the table. Nature, 405(6786):517{517. [2] Coates, A., Huval, B., Wang, T., Wu, D., Catanzaro, B., and Andrew, N. (2013). Deep learning with cots hpc systems. In Proceedings of The 30th International Conference on Machine Learning, pages 1337{1345. [3] James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An introduction to statistical learning. Springer. [4] Kuhn, M. and Johnson, K. (2013). Applied predictive modeling. Springer. [5] Le, Q. V. (2013). Building high-level features using large scale unsupervised learning. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pages 8595{8598. IEEE. [6] Lu, D. and Xu, S. (2013). Principal component analysis reveals the 1000 genomes project does not suciently cover the human genetic diversity in asia. Frontiers in genetics, 4. [7] Maltarollo, V. G., Honorio, K. M., and da Silva, A. B. F. (2013). Applications of arti

78. cial neural networks in chemical problems. [8] Shah, A. R., Oehmen, C. S., and Webb-Robertson, B.-J. (2008). Svm-hustle|an iterative semi-supervised machine learning approach for pairwise protein remote homology detection. Bioinformatics, 24(6):783{790. Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 54 / 55

79. Book Reading Plan END Email : secondmath85@gmail.com Oce: (02)880-2743 H.P: 010-9192-5385 Jinseob Kim (GSPH, SNU) Machine Learning: Introduction October 18, 2014 55 / 55

Machine Learning Introduction

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Andere mochten auch

Andere mochten auch (9)

Ähnlich wie Machine Learning Introduction

Ähnlich wie Machine Learning Introduction (20)

Mehr von Jinseob Kim

Mehr von Jinseob Kim (20)

Kürzlich hochgeladen

Kürzlich hochgeladen (20)

Machine Learning Introduction