Google Scholar and the Academic Web (November 2013) slides. Delivered as part of the Durham University Researcher Development Programme. Further training is available at https://www.dur.ac.uk/library/research/training/
5. Intelligent Web Searching
• What are you looking for?
– Breadth or precision
– Single document or comprehensive coverage
• How are you searching?
– Targeted searching
• Combining terms = narrow search; AND is assumed
• OR, “phrase”, -not, ~synonym, intitle:, site:.ac.uk,
date:months
– Evaluating results
9. Getting the most from Google
How to search effectively:
Tsunami defences
assumed ‘AND’ returns results with both terms
Property -intellectual
excludes all results that include ‘intellectual’
Butterfly OR lepidoptera
searches for either of your search terms
10. Getting the most from Google
How to search effectively:
“early warning system”
returns results with exact phrase
intitle:endochronology
returns results with term in document title
site:.gov.uk
only returns results from specific site/domain
~ghosts
returns related terms, eg paranormal, haunted
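The operators on these two slides can be combined in one search string. The following is a minimal sketch in Python, not an official Google tool: the function name `build_query` and its parameters are hypothetical, and it only concatenates the operators shown on the slides (assumed AND, OR, "phrase", -exclude, intitle:, site:).

```python
def build_query(all_terms=(), phrase=None, any_of=(), exclude=(),
                intitle=None, site=None):
    """Assemble a search string from the operators listed on the slides.

    Hypothetical helper for illustration only; it does no searching itself.
    """
    parts = list(all_terms)                  # plain terms: AND is assumed
    if phrase:
        parts.append(f'"{phrase}"')          # exact phrase
    if any_of:
        parts.append(" OR ".join(any_of))    # either term
    parts.extend(f"-{term}" for term in exclude)  # exclude a term
    if intitle:
        parts.append(f"intitle:{intitle}")   # term in document title
    if site:
        parts.append(f"site:{site}")         # limit to a site/domain
    return " ".join(parts)


print(build_query(all_terms=["tsunami", "defences"],
                  phrase="early warning system",
                  site=".gov.uk"))
# prints: tsunami defences "early warning system" site:.gov.uk
```

Building the string step by step makes it easy to add or drop one operator at a time and compare the number of results.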
12. Google Scholar
• Scholarly literature
• Articles, theses, books, abstracts or court
opinions
• Advanced features
Citations, grouped articles, related articles,
alerts, set up ConneXions off campus, links
to Endnote downloads
Google Scholar
24. Advantages over library
databases
• Broader range of resource types e.g.
books, journal articles, theses
• Information from range of sources
e.g. databases, publishers, OA
repositories
• Simple to search
30. Disadvantages
• Too many results(?)
• Less quality control
• Coverage: Doesn’t index all publisher content
• Inconsistent level of bibliographic information
• Some non-academic document types e.g.
handbooks
• Less developed search options and reduced
ability to limit searches
34. Google Scholar
• Track citations to your publications
– Check who is citing your publications. Graph your
citations over time. Compute citation metrics.
• View publications by colleagues
– Keep up with their work. See their citation metrics.
• Appear in Google Scholar search results
– Create a public profile that can appear in Google
Scholar when someone searches for your name.
39. Academic Resources
• Books
– Google Books, Project Gutenberg, Universal
Library (Access to full text or previews)
– COPAC, WorldCat (Identify books in other
libraries)
• Journal ToCs
– ZETOC, JournalTOCs, ticTOCs, My Favourite
Journals
40. Academic Resources
Open Access and repositories
• Institutional: DRO, Durham e-Theses, LSE Online
• Subject specific: arXiv, RePEc, SSRN, PubMed
• Use OpenDOAR or Google Scholar
41. Social / Academic Resources
Make use of what others are already collecting:
• CiteULike – search at http://www.citeulike.org/
• Delicious – search Google (site:.delicious.com)
• Scoop.it – search at http://www.scoop.it/
• Twitter
50. The Hidden Web
• Search engines can access only about 16% of
the available information on the WWW.
• Many library databases are not indexed by
Google Scholar and other search engines.
• If they are, they may not be very visible.
Library web pages
51. Image Credits
[Slide 3] Via Flickr Creative Commons, by Stefan. Original available at
http://www.flickr.com/photos/49462908@N00/3951143570
[Slide 11] Via Flickr Creative Commons, by David Goehring. Original
available at http://www.flickr.com/photos/15923063@N00/143186839
[Slide 32] Via Flickr Creative Commons, by Alexandre Duret-Lutz, available
at http://www.flickr.com/photos/24183489@N00/320300354
[Slide 38] Via Flickr Creative Commons, by jrgcastro. Original available at
http://www.flickr.com/photos/19939966@N00/2618893334
[Slide 43] Via Flickr Creative Commons, by GuidosPortaal. Original
available at http://www.flickr.com/photos/38239176@N04/3843484756/
[Slide 49] Via Flickr Creative Commons, by FutUndBeidl. Original available
at http://www.flickr.com/photos/61423903@N06/7557181168
Welcome. The main focus is on highlighting some aspects of Google Scholar and on how to get the most out of it by setting up your preferences as a member of Durham University. This is mainly aimed at those with little or no experience of using Google Scholar, but hopefully there will be something new for everyone. Part of the session also highlights some other resources which you may or may not be aware of.
QUESTION: Has anyone used Google Scholar before? Just to get an idea of where we stand with this session. Layout of session. The latter part of the session will look at some free sources available from the wider web which aren’t all included in a Google Scholar search.
Getting the most out of Google. Example in image: if you get the reference, it is far more likely that the little storm troopers will get lots of information telling them about what droids they are not looking for. Google Scholar does cope better with a typed sentence or question than many academic databases, but for professional researchers it is not the best way of searching, and you should be trying more appropriate techniques.
ASK QUESTIONS FIRST
These key questions are partly about effective web searching but are also relevant to any database.
Para 0: Ask attendees (very basic question):
1) Why do you search the web? [vastness, free, convenient]
2) What do you commonly use? [Google]
Para 1: Ask attendees (very basic question):
3) Do you favour breadth over precision? [Reassurance]
4) Do you refine down an initial large set of results? Or are you happy to navigate large sets of results?
5) Do you start without knowing precisely the object of your search, aiming to find whatever may prove significant to your research? - all articles, specific article (e.g. all local cinema listings, specific cinema)
Para 3: Ask attendees (very basic question):
6) Do you just type in a question, or do you try to identify keywords and different search techniques?
7) Do you use the results you have already found to try new combinations of search terms, refocus your search etc.?
Reason 1 for not typing in a sentence – the order and proximity of terms affect search results. Including unnecessary words and stopwords could affect results unintentionally.
[animations on slide]
GOOGLE. Shows the use of different search tips to narrow down from over 12 million results to 8 results.
Phrase searching: reduces the number of results by 90% (25% in June 2013)
Additional keywords: focuses the search, reduces by 40% (75% in June 2013)
Exclude results: filters out 99% of results
Limit to UK academic sites: filters out 90% of results. A good indicator of the range of content covered by Google (note: not perfect, as doing this filters out academic blogs, most news coverage etc.)
Faceted search: searching a particular part of a document to add focus. In the title of the document (in part reliant on authors following web standards).
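The cumulative effect of the filters in these notes can be followed with simple arithmetic. The sketch below is illustrative only: it takes a round 12,000,000 as a stand-in for "over 12 million", applies the percentages quoted above in the order listed, and stops before the final title search, whose reduction the notes do not quantify. The function name `apply_reductions` is hypothetical.

```python
def apply_reductions(start, steps):
    """Apply each 'filters out N%' step in turn, using integer arithmetic."""
    remaining = start
    trail = []
    for label, pct_removed in steps:
        remaining = remaining * (100 - pct_removed) // 100
        trail.append((label, remaining))
    return trail


# Percentages as quoted in the notes; the starting figure is an assumption.
STEPS = [
    ("phrase searching", 90),
    ("additional keywords", 40),
    ("exclude results", 99),
    ("limit to UK academic sites", 90),
]

for label, remaining in apply_reductions(12_000_000, STEPS):
    print(f"{label}: {remaining:,}")
# phrase searching: 1,200,000
# additional keywords: 720,000
# exclude results: 7,200
# limit to UK academic sites: 720
```

On these assumed figures, four simple operators take the set from millions to hundreds before the title search is applied.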
I can give you plenty of hints today about searching Google and Google Scholar. You can go and read Google’s search help pages (and then read them again next month because they may have changed)... But you will never be completely sure exactly what is happening with your search, and why some results are “more relevant” than others. (With many of our academic databases you can’t either, but we might be able to get a better idea when we come to buy/negotiate, so they are not as secretive or as potentially biased as Google, because Google bases much of its income stream on how its search function is used.)
AND – the more search terms you include, logically the fewer the results, as results have to mention all terms.
Reason 2 for not typing in a sentence – stop words. Google ignores many terms you might enter in a search, meaning that entering a sentence or question is often just a waste of time typing. These can include: HE, SHE, AT, THE, A, ZERO, DESCRIBED, UPWARDS, LEAST, THIRD (refer to handout).
NOT – be careful: it may remove results which would have been useful and just happened to mention a term in passing, or in reference to something else.
OR – broadens your search.
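To see why typing a full sentence adds little, the stop-word idea can be sketched in a few lines of Python. This is only an illustration: the set below contains just the example words quoted in these notes, not Google's actual list (which is not given here), and the function name `keywords` is hypothetical.

```python
# Example stop words quoted in the notes; NOT Google's real list.
STOP_WORDS = {"he", "she", "at", "the", "a", "zero", "described",
              "upwards", "least", "third"}


def keywords(sentence):
    """Return the words of a sentence that are not in STOP_WORDS."""
    words = [w.strip('?.,!"').lower() for w in sentence.split()]
    return [w for w in words if w and w not in STOP_WORDS]


print(keywords("She described the tsunami defences at the coast"))
# ['tsunami', 'defences', 'coast']
```

Only three of the eight words typed carry the search; starting from those keywords is quicker and makes it clearer what is actually being searched.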
Phrase searching – as seen, can have a significant effect. Doesn’t always work (even though Google promotes its use on its own help pages) but is much better now than it was a few years ago.
Intitle – relies on authors of web pages coding their site properly, and on the term you are searching for being the key focus of the document (and not just one of several focuses), but can massively reduce the number of results.
Site limitation – I personally find this useful, but if you wish to search different domains/sites I find it easier to run separate searches, especially given different terminology between countries/regions (e.g. retardation was used far more recently in US healthcare terminology than in the UK, whilst homicide is obviously a US term, so it will be of less use if limiting to UK sites).
Synonyms and related terms – useful, but it is difficult to always identify what terms are being searched.
ADVANCED SEARCHING
DEMO: Go to preferences first to make sure everything is fully set up for ConneXions and Endnote. Then display the advanced search options (click on the down arrow in the search box).
with all of the words: electron
with the exact phrase: "liquid helium"
author: platzman
Explain citations. Come back to ‘versions’ in a subsequent slide.
Related articles are based on ‘relevance’ and ‘how similar’, i.e. they use Google’s own algorithm, which we’re not party to. Maximum number 101.
Citation and keyword alerts – ‘alert’ in citation, ‘alert’ after searching.
ConneXions works.
Endnote link – change preferences where necessary.
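The advanced search demo in these notes (all of the words, exact phrase, author) can also be written as a single query string. The sketch below is an assumption-laden illustration: the function name `scholar_url` is hypothetical, and the URL pattern `https://scholar.google.com/scholar?q=...` is assumed rather than taken from the slides. It builds a link only and sends no request.

```python
from urllib.parse import urlencode


def scholar_url(all_words, exact_phrase, author):
    """Build a Google Scholar search link from the three demo fields.

    The base URL and the 'q' parameter are assumptions for illustration.
    """
    query = f'{all_words} "{exact_phrase}" author:{author}'
    return "https://scholar.google.com/scholar?" + urlencode({"q": query})


print(scholar_url("electron", "liquid helium", "platzman"))
# https://scholar.google.com/scholar?q=electron+%22liquid+helium%22+author%3Aplatzman
```

Writing the query out this way shows that the advanced search form is a convenience layer over the same operators used in the basic search box.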
Mention surprises you might receive with alerts
Normal alert…
Example in following screenshots
Results in ascending order: Web of Knowledge (34), Chicano/Historical Abstracts on Ebsco (37), Science Direct (1,044), JSTOR (16,137)… Google Scholar… about 124,000 results
Specific example, not just KW: GS = 47, WOS = 12
Reasons for why so many more results…
Open Access Repositories – Do you all know what they are? Institutions attempt to keep a draft copy of every article produced by their academics for publication in scholarly journals. The database of these final draft copies is then freely available. Items from these institutional repositories are sometimes ‘harvested’ and grouped together by subject in secondary repositories.
Depends what you’re after – may not be too many if doing a broad search or if good with KW searching.
Doesn’t index all databases – don’t think you’ve searched everything. Not good for primary sources e.g. newspaper articles (unless books), and only just starting coverage of EThOS (even though freely available) from August 2012. http://ethostoolkit.cranfield.ac.uk/tiki-read_article.php?articleId=10
Inconsistent level of bibliographic information – sometimes a basic citation, sometimes more. Also doesn’t flag theses as theses but as books. That said, not all databases are consistent either: abstracts are often unavailable for older articles, and theses appear as books on the library catalogue.
Less developed search options and ability to limit searches – compare the ability to refine searches and reorder results in some of the major databases such as WOK, which is more of an academic setup.
See example comparison on following slide…
Benefits – small time frame, so stops bias towards older journals.
Bias towards those which publish a lot of review articles, as they are more likely to be widely cited.
Bias towards European and North American journals – remember, only journals in this database count, and there are few LOTE (languages other than English) in here.
Demo 4 – Google Scholar citations and metrics
Start at the Google Scholar homepage, and point out the links at the top (My Citations, Metrics).
- Click on Metrics. Show list of top citing journals, listed by h-index, by language and field of study. Show differences in average citations plus h-index.
- Click on Google Scholar to return
- Click on My Citations to show log in.
- Show around screen
- From drop-down menu, select add to
- Search for Loughlin in ‘search articles’
- Add article (The calf in Bronze Age Cretan art and society), then go to profile to show… then delete and permanently delete from trash.
- Click on Add again and show how you can add references manually
Then return to slides to show fleshed out version maintained by author.
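Since the metrics demo lists journals by h-index, it may help to show how that number is derived. The sketch below uses the standard definition (the largest h such that h items each have at least h citations). The citation counts are made up for illustration, and the code is not how Google Scholar itself computes its figures.

```python
def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)   # most cited first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank        # the top `rank` items all have >= rank citations
        else:
            break
    return h


# Invented citation counts for five articles.
print(h_index([10, 8, 5, 4, 3]))   # 4
```

Because only items at or above the threshold count, one very highly cited article raises the average number of citations far more than it raises the h-index, which is why the two figures can differ noticeably.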
Books: Google Books (taking over the world), Gutenberg Project, Universal Library, Alex catalogue, Gallica – Bibliothèque Nationale, ORB – online reference book for mediaeval studies.
Google Books: sources from two projects: publishers upload citation info and direct to a purchase site (many with a preview available) / a project to digitise content which has fallen out of copyright and is held in various public libraries.
Zetoc: need a subscription to set up alerts; accessible via the library website.
JournalTOCs, ticTOCs, My Favourite Journals: similar functions (MFJ over 10,000 journals).
CiteULike: an academic social network for sharing bookmarks, scholarly paper citations etc. within broad subject categories.
Repositories: http://www.dur.ac.uk/library/resources/online/repositories/
.dur.ac.uk/library/resources/online/googlescholar/
Demo OAIster: music AND protest - Rage from within the machine: protest music, social justice, and educational reform, a collective case study
Background... Late 90s through to the late noughties: various sites tried to “make order” of the web, e.g. BUBL, Infomine etc.
Social bookmarking services became increasingly prevalent from the mid-noughties – initially as personal, mobile bookmarking services, then as ‘shared bibliography’ services – alongside various sites such as pinterest, diggit, tumblr etc.
‘library history in nineteenth century Britain’
Point is that no two search engines will give you the same results – see next slide.
Suggestion that on the indexable web there are over 25 billion pages on over 100 million sites. Whether these are accurate or not doesn’t matter – the numbers are huge.
3% or less of first page results are the same across top engines. 50-60% of searches convert to a first page click.
Issues: .com or .co.uk? The number of results that you get will differ on a regular basis. Also sub-Googles e.g. blog, images.
Google bombs can influence results, e.g. I’m Feeling Lucky for French Military Victories = page mocked up to look like ‘did you mean French Military Defeats’.
Wikipedia is ranked highly.
Demo Dogpile…
- advanced search
- limit domain to .gov.uk
- phrase: research councils
- all terms: funding research
Hidden web
Even the best search engines can access only about 16% of the available information on the World Wide Web. Therefore 84% of the information is excluded. That 84% has become known as the Invisible Web. The Invisible Web is 500 times larger than the Surface Web. 95% of the Invisible Web is publicly accessible information.
Content found in databases – Database content that is dynamically generated as the result of a query cannot be found by general-purpose search engines. Examples: ERIC database, library catalogues.
Subscription database content – Fee-based database content is only accessible to those who have subscribed. (Many libraries offer their members free access to subscription databases.) Examples: EBSCOhost databases, LexisNexis Academic.
Information offered on very content-rich websites – General-purpose search engines only partially index very large (deep) websites. The parts of the website that they do not index become part of the Invisible Web. Examples: Library of Congress, U.S. Census Bureau.
Real-time content – Information about events currently taking place may not yet be indexed by general-purpose search engines.
Formats – Information occurs in various formats, some of which are not indexed by general-purpose search engines. It also takes time for new formats to appear in search engines. Example: podcasts.
Sites requiring login authorization – These sites require users to log in or identify themselves as having the right to access and use content. Examples: Blackboard, membership sites.
Sites with interactive content – These sites require information from the user, e.g. filling out a form, before they can generate an answer. Examples: travel direction sites, job hunting sites.
New content – It may take time for a search engine to find and include new websites and newly added website content.
Sites that are not linked to by other sites – Search engines index websites by following links from one website to another; if there aren't any links to a site it might not be found or included.
Sites blocked by Robot Exclusion Protocols
DEMO: Abbot George archbishop Canterbury in:
Google – DNB is indexed, but does not appear high up the search ranking (rarely within first 5-10 pages)
GS – DNB is not indexed. Open Wikipedia article (first result on Google) – references to DNB
Use DNB author Fincham in GS to limit search by author.
E-books on catalogue and http://www.dur.ac.uk/library/resources/online/ebooks
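As a small illustration of the Robot Exclusion Protocol mentioned in these notes, Python's standard library can read a robots.txt file and report whether a crawler may fetch a page. This is a sketch under stated assumptions: the robots.txt content, the `example.org` URLs, the crawler name `ExampleBot` and the helper name `allowed` are all invented for the example, and nothing is fetched from the network.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt: every crawler is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())   # parse text directly, no download
    return parser.can_fetch(agent, url)


print(allowed(ROBOTS_TXT, "ExampleBot", "https://example.org/private/report.html"))  # False
print(allowed(ROBOTS_TXT, "ExampleBot", "https://example.org/public/index.html"))    # True
```

Pages under the disallowed path stay out of a well-behaved search engine's index even though they are publicly reachable, which is one way content ends up in the hidden web.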