9 posts tagged 'Google'

What might Microsoft be planning with its purchase of hot startup Powerset? The three-year-old company, founded by Dr. Barney Pell, recently launched a semantic search experience for Wikipedia.
It is doubtful that Microsoft bought the company just to enhance Live Search. Possibly the plan is to replicate the Wikipedia solution, then incorporate Powerset into Internet Explorer. In this post we look at what the thinking behind the acquisition might be.
Most initial reviews found the Powerset product release underwhelming. Critics appreciated the innovative semantic UI and recognized its potential, but believed it didn't vastly improve Wikipedia. So in view of the lukewarm reviews, the acquisition by Microsoft was unexpected. The $100M price tag is around 5x the $20M put into the company (a $12M Series A plus an $8M investment). Microsoft execs must believe Powerset can be a weapon in its battle with Google.
Given a set of unstructured information, Powerset applies Natural Language Processing techniques to extract entities and the key semantic concepts from the text. It then builds a semantic index (similar to Google's) as well as a conceptual graph of relationships between entities. This graph is typically expressed in RDF triples.
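To make "a graph expressed in triples" concrete, here is a minimal sketch in Python. It is not Powerset's implementation: the predicate names (`foundedBy`, `acquiredBy`, and so on) and the `match` helper are hypothetical, and the facts are simply ones stated in this post. Each triple is a (subject, predicate, object) edge, and a query is a pattern with wildcards.

```python
from typing import Optional

Triple = tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical triples of the kind an NLP pipeline might extract from text.
# The predicate names are invented for illustration.
TRIPLES: list[Triple] = [
    ("Powerset", "foundedBy", "Barney Pell"),
    ("Powerset", "acquiredBy", "Microsoft"),
    ("Powerset", "indexes", "Wikipedia"),
    ("Microsoft", "competesWith", "Google"),
]


def match(s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> list[Triple]:
    """Return every triple that fits the pattern; None is a wildcard."""
    return [
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


if __name__ == "__main__":
    # "Who founded Powerset?" becomes a pattern with an unknown object.
    print(match(s="Powerset", p="foundedBy"))
```

A real RDF store would use URIs rather than bare strings and a query language such as SPARQL, but the pattern-matching idea is the same.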

One of Powerset's innovations is surfacing semantics in the user interface: a contextual gadget is overlaid on the page to help navigate the unstructured information.
Many thought Powerset to be a generic semantic search engine, but its first product is limited to Wikipedia. It is not trivial to scale the technology to the entire web.
When semantic technologies emerged a few years ago, people started talking about how semantic web and/or semantic search might be a Google killer. The talk was supported by logic that semantic search can deliver more relevant results because it "knows" the content.
The industry has since realized that this isn't the case. Semantic search has no huge advantage over the statistical approach used by Google. We discussed this in the post Semantic Search - Myth and Reality.
What is powerful about Powerset? Precisely that it doesn't try to search the web as a whole. Right now, the solution works on Wikipedia, but the infrastructure is generic, so any other site could also be enhanced. The contextual outline developed can be used to navigate any content.
Instead of dealing with the whole web, the idea may be to first build solutions for specific sites.

Powerset as it is today is no Google killer. At this point only something with huge traction and momentum would stand a chance.
In the search market, Google has a strong hold - potentially stronger if the Yahoo deal goes through. People are conditioned to Google: it's simple and, yes, imperfect, but it's good enough and the results are still better than Live Search.
If Microsoft bought Powerset with the goal of incorporating it into Live Search, then it is likely to be yet another acquisition that makes little impact on the bottom line. In fact, the announcement on the Live Search blog states just that. The number one reason is acquiring talent; the second is the belief that NLP and semantic algorithms will be able to patch holes in today's search.
Today Powerset brings only interesting technology; it doesn't bring traction. So what were they thinking up in Redmond? There may be a more subtle play, leveraging the fact that Powerset works well on knowledge sets like Wikipedia.
Possibly Microsoft plans to deploy Powerset across its own sites, then perhaps incorporate Powerset into Internet Explorer.
Imagine going to Wikipedia and having a semantic overlay on each page. Now imagine scaling this experience across major information sources around the web.
Providing contextual, semantic experience allows Microsoft to retain eyes longer, shaving off the time people spend searching Google.
This is an important point because Google doesn't make money on search - it makes money on advertising.
The real problem Microsoft is seeking to solve is advertising. Until now the web has figured out two fundamentals for advertising - portals and search.
Portals show ads on each page; the more people browse the content, the more ads are shown and the more money is made. The search model emerged as an alternative, now more successful, path to advertising dollars.
With Powerset and other semantic technologies, there's another model: contextual information exploration overlaid on existing content.
If Microsoft can figure out how to keep eyes off Google's home page, the game will shift dramatically. The browser is one of Microsoft's most powerful tools - and the default search box is Live Search.
If Microsoft wants to win over advertisers, it might just do more with the browser. Incorporating aspects of Powerset's semantic navigator into the browser by default could be a game changer. This is not a straightforward play. A large company with bureaucracy and execution problems is unlikely to be able to merge semantics into the browser quickly and elegantly.
The Powerset acquisition is an interesting move by Microsoft. This hot semantic startup was on everyone's radar.
What can the plan be? It is doubtful that Microsoft bought the company just to enhance Live Search. Possibly the plan is to replicate the Wikipedia solution, then incorporate Powerset into Internet Explorer.
That is a bold play requiring exact execution - not the kind Redmond has shown lately.
What do you think Microsoft is going to do with Powerset? What are the other applications of this technology that you can think of?
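Google bets its future on 'cloud computing'
Mike Ricciuti, compiled by reporter Park Hyo-jung, 2008/06/16
Google sees cloud computing as the enterprise trend. 'Cloud computing' here means that business users cut IT operating costs by using web applications hosted in remote data centers. Rishi Chandra, Google's enterprise product manager, argued that "the innovation of the next 10 years will happen in the cloud," adding that "enterprise software will not disappear, but a change is clearly under way."
Speaking at the Enterprise 2.0 conference held in Boston on June 9-12 (local time), Chandra explained why Google intends to win more business customers. The most important point, he stressed, is Google's strength in the consumer market, which will become its foothold in business computing. "The era of the cloud is arriving. The question is not when, but how fast it will come," he said. "Google is responding with Google Apps."
Chandra said that large companies such as Microsoft, Amazon.com, and Salesforce.com are converging on the market for delivering, over the web, business applications with the same reliability and security as existing business systems.
Chandra played down the competition with Microsoft. "We are in a competitive relationship with Microsoft, but we have no intention of competing. Google is trying to bring a new way of using applications to the market, and is focused on the end user," he stressed.
Microsoft, of course, has its own cloud-based plans. What chief software architect Ray Ozzie worries about most is open source and Google's ambitions. With Live Mesh, Microsoft has delivered part of its cloud computing plan. More details are expected to emerge later this year.
Chandra said that four industry trends support Google's strengths. The first is that, in Google's view, it is the consumer market that drives technological innovation. The consumer world is more Darwinian than the enterprise world: users do not put up with inferior products.
"In the consumer world there are no switching costs. Hundreds of millions of testers in the consumer world are helping the enterprise market. As a result, consumers are getting better technology than the enterprise world. Instant messaging (IM), search, and VoIP all originated in the consumer world," Chandra explained. He added that Google has learned a great deal in the consumer market. "Simplicity wins. Technology born in the consumer market soon becomes technology for the enterprise market," he stressed.
The second trend, Chandra said, is the rise of "power collaborators" inside companies. "Enterprise software is built by experts, for experts. But now that team-based work is increasing, it needs to be rethought from scratch for a new generation of employees. Which OS is used, or where someone works, should not matter. Software is based on 'open standards.' This is the vision of cloud computing, and it is why Google believes the cloud is the vision for next-generation enterprise computing," he said.
Chandra also pointed out that the economics of enterprise computing are changing. Companies are struggling to handle growing volumes of content, video, and photos. He cited Picasa, Google's photo sharing site, which processes 7 million photos a day. "With the cloud, Google has a major advantage that it can share with the market. Google's App Engine is basically a scalable hosting platform that offers almost unlimited space. The opportunity is very large," he said.
Finally, Chandra said that the barriers to enterprise adoption of cloud computing are disappearing. He named reliability as a major concern for companies. "Google cannot afford to stop running now. If Google stops, users leave Google. That is why Google is investing in cloud computing," he said.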
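[ZDNet Korea] How do employees of Google, the company that reigns as the world's most valuable brand, spend their days at work?
According to Fujishima Isamuzo (후지시마 이사무조), a software engineer at Google Japan, Google is a company that values its culture, so that culture is strongly reflected in everyday working life.
Fujishima summarized Google's culture in the following nine keywords.
1. Clarify: clarity
Everything is made explicit: what the criteria for a decision are, and what the process and the outcome are.
2. Transparency: transparency
Anyone can access the information needed for their work.
3. Democracy: democracy
Decisions are based on the will of all employees rather than made top-down.
Here is a real example. In Google's early days, when the company had grown and needed to move to a new office, the location was decided neither by executives' opinions nor by a simple majority vote. A large map was put up on the wall, every employee's home was marked on it with a pin, and the new office was chosen on that basis.
Pinning a paper map to the wall is, of course, an old story. When the New York office moved recently, the pins were reportedly plotted using the Google Maps API.
4. Facilitate: facilitation
Employees are asked to use 20% of their working hours in ways that benefit all employees rather than on purely personal pursuits. There is also a training program for new hires.
5. Respect: respect
Everyone is believed to have good ideas. Even when a meeting gets heated, personal attacks are forbidden.
6. Initiate: taking the initiative
When there is a problem, engineers act on their own to fix it, and carry it through with a sense of responsibility.
7. Iterate: iteration
Rather than waiting for something to be perfect, try it first, then respond to the results flexibly and quickly.
8. Scrappy: making do with what is at hand
Use what is available now to reach the goal. If something is inefficient, think of a more efficient way to do it.
9. Party: making the people around you happy
Celebrate when there is something to celebrate. Praise even those who do small, unnoticed work. Work with enthusiasm.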
[Photo: Fujishima Isamuzo (후지시마 이사무조), software engineer at Google.]
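■ Google's development organization
The core of Google's development organization is that every development group in every office around the world is on an equal footing. There is no sense that the Mountain View headquarters is special.
The idea is that an engineer belongs to Google's single worldwide engineering team and simply commutes to whichever office is most convenient. As a result, the members of a project team are spread across several offices.
The problem with this approach is communication, and Google copes by using every tool available: email, chat, video, phone, business trips, wikis, Google Docs & Spreadsheets, blogs, and so on. Even so, time differences remain a headache.
Google's projects span a very wide range, including kernels, compilers, tools, middleware, systems, applications, and UI, but all of them grow out of ideas born at the bottom of the organizational structure.
When a Google employee has an idea, they first start a personal project under the "20% rule," an internal policy that lets employees devote 20% of their time to work outside their regular duties. If the project wins recognition inside the company, it becomes a main project.
These projects are developed by small teams. Team membership is open: someone who says "I would like to work on that project in addition to my own job" may well be accepted.
A software engineer's job is "everything related to developing a service": from the idea through design, coding, testing and debugging, evaluation and analysis, and maintenance and improvement. Development follows the mottos "Write the documents that are needed properly, and do not write the ones that are not" and "A hundred documents are no match for one demo."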
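■ Developers who also do non-development work
Software engineers basically concentrate on development, but there are three exceptions. One of them is recruiting.
A very high proportion of Google's engineers are hired through employee referrals.
Google's strategy holds that engineers' cooperation is needed to secure talented people. Engineers take part in interviews, checking candidates' computer science fundamentals and whiteboard coding, and judging whether they will fit Google's culture.
Recruiting has the side effect of taking up time, but Fujishima explains that it has benefits too: engineers get to choose the people they want to work with at Google, and interviewing is a learning experience in itself.
The second is performance evaluation. At Google, goals are set and evaluated every quarter at several levels: individual, team, and company-wide. What is distinctive is that the work of people who collaborated is evaluated by one another: engineers on the same team, people from other departments, and managers.
Such a system could put trust among employees at risk, but it reportedly works well at Google. Because colleagues work side by side, they know each other's working habits well, so the results are more accurate. In addition, people who handle work that tends to escape a manager's notice can receive fair recognition.
The last job is "play." As the keyword "Party" among the nine listed earlier suggests, Google has a culture of combining play and work.
A few times a year, staff leave the office for an outing. Google Japan has so far held events such as rafting. Needless to say, all kinds of club activities thrive as well.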
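■ A day in the life of a Google engineer
Fujishima spends most of his working day on coding, his main job, and also attends "Tech Talks," the company's in-house technical lectures. These are a valuable chance to hear about other projects and about technical topics outside any project. To come up with ideas for the future, one needs to take an interest in things beyond the work immediately at hand. Because many high-level talks are held, he says, it is possible to absorb new ways of thinking.
Fujishima says he joined Google because he "wanted to develop software for people all over the world together with excellent people." Once he actually joined, he found that many Googlers were "not only excellent but also appealing as people, which was a happy surprise."
What impressed him most was the culture of "judging an opinion not by who said it but by whether it has value."
There is a real example of this too. A newly graduated engineer once challenged an email sent by the developer of Python. The objection was not ignored; its content was examined and it grew into a discussion. What matters most is not title or age but whether the argument is sound.
Fujishima also said he is grateful that Google as a company respects and trusts its engineers.
"Engineers can access all information inside the company," he said. "Google does whatever is needed to develop good software. It does the unglamorous work properly too. That was unexpected, but it is a very good thing."
What Fujishima cares about most while working at Google is finding "colleagues who are good as people." Working alongside good colleagues is surely a wish shared by everyone who works.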
Associative Search and the Semantic Web: The Next Step Beyond Natural Language Search
Our present day search engines are a poor match for the way that our brains actually think and search for answers. Our brains search associatively along networks of relationships. We search for things that are related to things we know, and things that are related to those things. Our brains not only search along these networks, they sense when networks intersect, and that is how we find things. I call this associative search, because we search along networks of associations between things.
Human memory -- in other words, human search -- is associative. It works by "homing in" on what we are looking for, rather than finding exact matches. Compare this to the keyword search that is so popular on the Web today and there are obvious differences. Keyword searching provides a very weak form of "homing in" -- by choosing our keywords carefully we can limit the set of things which match. But the problem is we can only find things which contain those literal keywords.
There is no actual use of associations in keyword search; it is just literal matching to keywords. Our brains, on the other hand, use a much more sophisticated form of "homing in" on answers. Instead of literal matches, our brains look for things which are associatively connected to things we remember, in order to find what we are ultimately looking for.
For example, consider the case where you cannot remember someone's name. How do you remember it? Usually we start by trying to remember various facts about that person. By doing this our brains then start networking from those facts to other facts and finally to other memories that they intersect. Ultimately through this process of "free association" or "associative memory" we home in on things which eventually trigger a memory of the person's name.
Both forms of search make use of the intersections of sets, but the associative search model is exponentially more powerful because for every additional search term in your query, an entire network of concepts, and relationships between them, is implied. One additional term can result in an entire network of related queries, and when you begin to intersect the different networks that result from multiple terms in the query, you quickly home in on only those results that make sense. In keyword search on the other hand, each additional search term only provides a linear benefit -- there is no exponential amplification using networks.
Keyword search is a very weak approximation of associative search because there really is no concept of a relationship at all. By entering keywords into a search engine like Google we are simulating an associative search, but without the real power of actual relationships between things to help us. Google does not know how various concepts are related and it doesn't take that into account when helping us find things. Instead, Google just looks for documents that contain exact matches to the terms we are looking for and weights them statistically. It makes some use of relationships between Web pages to rank the results, but it does not actually search along relationships to find new results.
Basically the problem today is that Google does not work the way our brains think. This difference creates an inefficiency for searchers: We have to do the work of translating our associative way of thinking into "keywordese" that is likely to return results we want. Often this requires a bit of trial and error and reiteration of our searches before we get result sets that match our needs.
A recently proposed solution to the problem of "keywordese" is natural language search (or NLP search), such as what is being proposed by companies like Powerset and Hakia. Natural language search engines are slightly closer to the way we actually think because they at least attempt to understand ordinary language instead of requiring keywords. You can ask a question and get answers to that question that make sense.
Natural language search engines are able to understand the language of a query and the language in the result documents in order to make a better match between the question and potential answers. But this is still not true associative search. Although these systems bear a closer resemblance to the way we think, they still do not actually leverage the power of networks -- they are still not as powerful as associative search.
A natural language search can understand the meaning of a query like "books about Harry Potter" and it knows this is not the same as "Books by Harry Potter." But ultimately what is happening is that a linguistic expression is being converted into a more sophisticated keyword search. The language in the query is being mapped to documents that contain text that answers a question, or to data objects that match the thing being asked for. This is certainly better than keyword search but it is still ultimately just a smarter form of literal matching. It is not really making use of associative search along networks of semantic relationships in the data (other than linguistic relationships between words in the query) or any sort of sophisticated reasoning.
By comparison, associative search doesn't merely understand the meaning of the query, it understands and can reason about relationships in the data. This is an important distinction.
An associative search returns documents that represent things that are related, via various forms of associations (semantic links), to the things in the query. An associative search looks through a network of associations for the things that are most connected to the items in the query. By specifying more specific starting points, the set of things which are connected to all those starting points is narrowed. Thus an associative search is an intersection of multiple networks. The items that are most strongly intersected are the results that are most likely to matter.
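The "intersection of multiple networks" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the association graph, its node names, and the functions `neighborhood` and `associative_search` are all hypothetical, and links are untyped and unweighted, which a real system would not be. Each starting point expands into the set of things connected to it, and the result is whatever all of those sets share.

```python
from collections import deque

# Hypothetical association graph: node -> directly associated nodes.
GRAPH = {
    "Boston": {"Alice", "Bob"},
    "cello": {"Alice", "Carol"},
    "chess": {"Bob"},
    "Alice": {"Boston", "cello"},
    "Bob": {"Boston", "chess"},
    "Carol": {"cello"},
}


def neighborhood(start, max_hops):
    """Nodes reachable from `start` in 1..max_hops hops (start excluded)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in GRAPH.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    seen.discard(start)
    return seen


def associative_search(starts, max_hops=1):
    """Intersect the neighborhoods of every starting point."""
    hoods = [neighborhood(s, max_hops) for s in starts]
    return set.intersection(*hoods) if hoods else set()


if __name__ == "__main__":
    # "Someone I met in Boston who plays the cello"
    print(associative_search(["Boston", "cello"]))
```

Adding a starting point can only shrink the result, which is the narrowing effect the paragraph above describes.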
Associative search is a very different approach to search from keyword search (which merely looks for things with the keywords in them) and natural language search (which merely looks for things that contain content that matches the meaning of the question). It also happens to be more similar to how our brains actually think.
On its own, associative search represents an important advance in the way we search. But by adding some simple reasoning to an associative search it becomes even more powerful. Reasoning adds the ability to generalize or get more specific, and to weight various paths through the network of relationships in more sophisticated ways, such as based on logical relationships or inferences through the network.
A simple example of reasoning is transitivity -- for example, if A is a part of B and B is a part of C, then A is a part of C. If we know that the "part of" relationship is transitive, then whenever we see chains of "part of" links between things we can make transitive inferences. In an associative search these inferences are quite useful. For example, we can search for all the parts of a 747 jet. Using transitive reasoning along networks of relationships we can find all the parts, even those things that are "parts of parts." Similarly we could find "all products of Sony" including products of subsidiaries and business units of Sony. Transitive inferences across transitive links is just one type of reasoning; there are many other variations that are possible, which when combined together become even more useful.
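Transitive inference over "part of" links is easy to sketch in Python. The parts list below is hypothetical and heavily simplified, and `all_parts` is an illustrative name rather than any standard API. The function follows chains of "part of" links downward from a whole and collects everything it meets, including parts of parts.

```python
# Hypothetical "part of" links: part -> the thing it is directly part of.
PART_OF = {
    "fan blade": "engine",
    "engine": "747",
    "flap": "wing",
    "wing": "747",
    "wheel": "car",
}


def all_parts(whole):
    """Everything linked to `whole` by a chain of one or more 'part of' links."""
    # Invert the links so we can walk from a whole down to its direct parts.
    direct = {}
    for part, parent in PART_OF.items():
        direct.setdefault(parent, set()).add(part)

    found = set()
    stack = [whole]
    while stack:
        current = stack.pop()
        for part in direct.get(current, ()):
            if part not in found:
                found.add(part)
                stack.append(part)  # parts of parts are found transitively
    return found


if __name__ == "__main__":
    print(sorted(all_parts("747")))
```

The same traversal would answer the "all products of Sony" query if the links were "subsidiary of" and "product of", provided those relationships are declared transitive.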
Our current search tools -- whether they are keyword based or natural language based -- do not support true associative search, let alone reasoning. But we do see associative search starting to appear in a very different breed of application: social networks. A search in LinkedIn, for example, is an associative search. Will social networks do an end-run around traditional search engines to provide the next generation of search? It's quite possible. Facebook and LinkedIn are far better positioned than Google today for associative search. In fact, I would venture that this is how Facebook could give Google some serious competition. But they have to hurry if they are going to do this -- Google has clearly realized the power of "social search" and is rapidly moving to leverage it in their own search results.
Ultimately associative search is more than just social search however. To be really effective, associative search engines need to understand and leverage the full spectrum of relationships between things, not just social relationships. They need to see and understand more types of relationships between more types of things. In order to accomplish this, associative search engines need the Semantic Web.
The Semantic Web provides exactly what is needed to enable associative search, with reasoning, on the Web-at-large. Using RDF and OWL, content can be marked up with metadata that specifies not only its intended meaning and structure, but also the various kinds of semantic relationships it has to other content and to other concepts. In other words, these standards provide a way to add a new network of semantically defined associations to the data on the Web. For example a document about Microsoft can be linked to the concepts "Software Company," "Software Manufacturer," and "Redmond." It can also be linked to a data record that represents "Microsoft" and the properties that define it as a company. The "Microsoft" object can then link to companies that are "suppliers" and "customers" and "competitors" as well as to things which are connected as "products" or "services."
This rich network of relationships between things goes far beyond documents. It contains relationships to people, places, other organizations, products, events, services, etc. It's similar to a social network, but instead of just containing people and social relationships, it contains more types of things and relationships between them. This is really what the Semantic Web enables. One can imagine that as this new semantic data becomes visible on the Web (which is rapidly happening in fact), the power of search will be dramatically improved. Associative search is coming soon to a Web near you!
With that in mind, here is an example of how Semantic Web enabled associative search will work in the future.
PROBLEM: I am trying to remember the name of the organizer of a conference I once attended.
WHAT I ALREADY KNOW:
- I know this person and have corresponded with them in the past.
- The conference was related to government and the Internet.
- It took place in a town near
- The organizer of the conference once introduced me to a male celebrity, but I can't remember the celebrity's name.
- I gave a talk at the conference about Web 3.0.
- My friend, Sue Smith, also spoke at the conference.
- The conference I attended took place in the spring, but I am not sure if it was last year or two years ago.
In the above example, I cannot remember the specific keywords that will help me generate a query to find the answer. Instead, I remember a number of relationships and generalizations about the answer. Present day search engines cannot see these relationships, and they have no ability to understand a generalization and look at things it contains.
The ability to intersect the sets formed by relationships and generalizations is a fundamental feature of human memory and search. But our present day tools don't have these capabilities. Thus we have to spend time translating our questions into keywordese, rather than just asking our questions in the actual language of human thought.
There are two ways to approach solving this.
The first way is to create artificial intelligence which, given a question in natural language English, can understand it and reason about the question as well as understand and reason about the information in the set of documents being searched, in order to intelligently arrive at candidate answers. This is computationally intensive, and very hard to program. This is why AI hasn't quite happened yet on this scale.
A perhaps easier approach is to use the Semantic Web. In the Semantic Web approach, metadata is embedded into content that describes the meaning of the content, its various important properties, and its relationships to other concepts. On the basis of this metadata, the problem becomes much simpler to solve. Instead of doing high-level AI it becomes essentially a statistical search.
Now let's look at how using the Semantic Web could help us solve the above problem via an associative search:
Items are connected to more general or specific concepts by virtue of semantic linkages between concepts. For example, the conference I am looking for is related to the concepts "Government" and "Technology." If I can at least remember that then I can find conferences related to government and technology. Furthermore, since the concept "Policy" is a subset of government it may be related to that topic as well.
Likewise, things are connected to things that are "near" them via geographic links. Because the conference was near
The organizer of the conference introduced me to a male celebrity. There are several celebrities in my social network. If the fact that I met certain people via introductions from other people was stored using semantic links, then this too would be searchable. For example, "find all celebrities I was introduced to by my connections" would be a solvable query. Similarly, "find people who introduced me to celebrities" would also be solvable.
The fact that I gave a talk at the conference could also be semantically represented on a data record describing the conference, as well as on my own profile. Thus there could exist a link such as "speaker at" which links me to various conferences I have spoken at. I could then get a list of all the conferences I have spoken at. I could also look for all the conferences where both Sue Smith and I were speakers.
Or, better yet, there could be a link called "Gave talk about" which links me to an instance describing each talk I have given. From such an instance there could then be "Gave talk at" links to all the events where I have given that talk. So I could look up my "Web 3.0" talk and then see all the conferences where I gave that talk.
Temporal relations can also be generalized and semantically represented. For example, the conference I am looking for took place in the spring. Therefore only look for conferences that took place in or near months that are considered to be in the spring season.
By intersecting the results of the above searches we narrow down very precisely to a set of people I might be looking for, or just to a single qualifying person.
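As a rough sketch of that intersection step, the Python below stores a few semantic links as triples and intersects the sets that each remembered fact selects. Everything in it is hypothetical: the conference records ("Conf A" and so on), the predicate names, the organizers other than Robert Jones, and the choice of March to May as "spring" (a Northern Hemisphere assumption). Only Sue Smith, Robert Jones, the topics, and the season come from the example in this post.

```python
# Hypothetical semantic links stored as (subject, predicate, object) triples.
LINKS = {
    ("Conf A", "topic", "Government"),
    ("Conf A", "topic", "Technology"),
    ("Conf A", "speaker", "Sue Smith"),
    ("Conf A", "speaker", "me"),
    ("Conf A", "month", "April"),
    ("Conf A", "organizer", "Robert Jones"),
    ("Conf B", "topic", "Technology"),
    ("Conf B", "speaker", "me"),
    ("Conf B", "month", "October"),
    ("Conf B", "organizer", "Pat Lee"),
    ("Conf C", "topic", "Government"),
    ("Conf C", "speaker", "Sue Smith"),
    ("Conf C", "month", "May"),
    ("Conf C", "organizer", "Kim Park"),
}

# Assumption: "spring" generalizes to these months (Northern Hemisphere).
SPRING = {"March", "April", "May"}


def subjects(predicate, objects):
    """All subjects that have `predicate` pointing at any of `objects`."""
    return {s for (s, p, o) in LINKS if p == predicate and o in objects}


def find_organizers():
    """Intersect one candidate set per remembered fact, then follow 'organizer'."""
    conferences = (
        subjects("topic", {"Government"})
        & subjects("topic", {"Technology"})
        & subjects("speaker", {"Sue Smith"})
        & subjects("speaker", {"me"})
        & subjects("month", SPRING)
    )
    return {o for (s, p, o) in LINKS if p == "organizer" and s in conferences}


if __name__ == "__main__":
    print(find_organizers())
```

No single fact identifies the conference here; each one leaves two candidates, and only the intersection leaves one.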
For example, the answer I was seeking was that the organizer was named Robert Jones, and the conference was about Government and Technology Policy in Carmel-by-the-Sea last spring.
This result should be easily findable via associative search starting from the above set of things I remember. But if for some reason the answer is still not there, there is another capability which the brain uses that we need to add to our search engines:
Step 1: Perturbation, or what could be called "prospecting."
Perturbation: in mathematics, a method of solving a problem by comparing it with a similar problem whose solution is already known. A solution obtained this way is usually only an approximation.
The query I entered consists of a question and a set of facts related to the answer I am seeking. But there is a possibility that I asked the question incorrectly, or that some of the facts I added were incorrect or insufficient. Perturbation can correct for this by introducing variations into the question and the facts in order to explore the space of answers that are "near" them as well.
There are many ways to go about adding perturbation to the system -- for example, we can search more than one hop out from every link, or we can search for other types of relationships that are highly correlated with relationships we are asking for explicitly, or we can include results for things which are strongly connected to things that are found.
From a user-interface standpoint perturbation can be controlled with a simple "sliding lever" in the user interface for "Precision." If the user sets very high Precision as a requirement then there is no perturbation -- the results are exact matches to the query and facts. If there is low Precision as a requirement then there can be more perturbation, thus the results are fuzzy and may include things that are near what I asked for but not exactly what I specified, enabling me to discover things via relevant relationships that I could not even remember to mention as facts.
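One simple way to picture the "Precision" lever is as a threshold on how many of the remembered facts a result must satisfy. The Python sketch below takes that route, which is a deliberate simplification: it relaxes facts rather than searching extra hops or correlated relationships, and the function name `perturbed_search`, the candidate names, and the linear mapping from precision to a threshold are all assumptions made for illustration.

```python
import math


def perturbed_search(fact_matches, precision):
    """Rank candidates that satisfy enough of the remembered facts.

    fact_matches: one set per remembered fact, holding the candidates
                  that satisfy that fact.
    precision:    slider value in (0, 1]; 1.0 means every fact must match.
    """
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")

    # Count how many facts each candidate satisfies.
    counts = {}
    for matches in fact_matches:
        for candidate in matches:
            counts[candidate] = counts.get(candidate, 0) + 1

    # Lower precision tolerates more missing (possibly misremembered) facts.
    required = max(1, math.ceil(precision * len(fact_matches)))
    hits = [c for c, n in counts.items() if n >= required]
    # Best-supported candidates first; ties broken alphabetically.
    return sorted(hits, key=lambda c: (-counts[c], c))


if __name__ == "__main__":
    facts = [
        {"Conf A", "Conf C"},  # topic: government
        {"Conf A", "Conf B"},  # I spoke there
        {"Conf A", "Conf C"},  # took place in spring
        {"Conf B"},            # a misremembered fact
    ]
    for p in (1.0, 0.75, 0.5):
        print(p, perturbed_search(facts, p))
```

With one fact misremembered, full precision returns nothing, while a slightly lower setting recovers the candidate that fits the remaining facts, and a still lower setting surfaces near misses.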
Step 2: Reasoner