oralce教程里的ctxsys.ctx_ddl如何创建

--查看DB中是否存在CTXSYS用戶
select * from all_users& where username='CTXSYS'
--查詢CTXSYS具有的角色,是否包含CTXAPP角色
select grantee,granted_role from dba_role_privs where grantee='CTXSYS'
--激活CTXSYS用戶
alter user CTXSYS account unlock
--創建測試用戶TEXTTEST
create user TEXTTEST identified by TEXTTEST
--修改CTXSYS密碼
alter user CTXSYS identified by CTXSYS
GRANT resource, CONNECT, ctxapp TO TEXTTEST
GRANT EXECUTE ON ctxsys.ctx_cls TO TEXTTEST
GRANT EXECUTE ON ctxsys.ctx_ddl TO TEXTTEST
GRANT EXECUTE ON ctxsys.ctx_doc TO TEXTTEST
GRANT EXECUTE ON ctxsys.ctx_output TO TEXTTEST
GRANT EXECUTE ON ctxsys.ctx_query TO TEXTTEST
GRANT EXECUTE ON ctxsys.ctx_report TO TEXTTEST
GRANT EXECUTE ON ctxsys.ctx_thes TO TEXTTEST
GRANT EXECUTE ON ctxsys.ctx_ulexer TO TEXTTEST
--設置詞法分析器
ctx_ddl.create_preference ('my_lexer1', 'chinese_vgram_lexer');
--查看系統或設置的oracle text參數
SELECT pre_name, pre_object FROM ctx_preferences
--新建測試表
create table TEXT_SEARCH(
id NUMBER NOT NULL PRIMARY KEY,&
ABSTRACT varchar2(2000));
--插入資料
insert into TEXT_SEARCH values(1,'《HTML5高级程序设计》首先介绍了HTML5的历史背景、新的语义标签及与以往HTML版本相比的根本变化,同时揭示了HTML5背后的设计原理。从第2章起,分别围绕构建令人神往的富Web应用,逐一讨论了HTML5的Canvas、Geolocation、Communication、WebSocket、Forms、Web Workers、Storage等API的使用,辅以直观明了的客户端和服务器端示例代码,让开发人员能够迅速理解和掌握新一代Web标准所涵盖的核心技术。《HTML5高级程序设计》最后探索了离线Web应用并展望了HTML5未来的发展前景。   《HTML5高级程序设计》面向有一定经验的Web应用开发人员,对HTML5及未来Web应用技术发展抱有浓厚兴趣的读者也可以学习参考。');
insert into TEXT_SEARCH values(2,'《软件架构师应该知道的97件事》是本与众不同的技术图书。五十多位作者中不乏像尼尔&&福特(Neal Ford)、迈克尔&&尼加德(Michael Nygard)、比尔&&德&&霍拉(Bill de h6ra)这样杰出的软件架构师,大家分享了多年积累的开发经验和工作准则.内容不限于单纯的技术范畴.还涉及如何与各方沟通、如何降低项目的复杂度、怎样强化开发团队等。');
--SQL& start D:\app\Milo\product\11.2.0\dbhome_1\ctx\admin\defaults\drdefus.
--創建全文索引
CREATE INDEX demo_abstract ON TEXT_SEARCH(ABSTRACT) indextype IS ctxsys.context parameters('lexer my_lexer1');
--查看索引建立是否出錯。
SELECT * FROM ctx_USER_index_errors
--基本測試
SELECT t.* FROM TEXT_SEARCH t WHERE contains(ABSTRACT,'HTML5高级程序设计')&0;
SELECT t.* FROM TEXT_SEARCH t WHERE contains(ABSTRACT,'HTML5高级程序设计 or 不同的技术图书')&0;
--AND 測試
SELECT t.* FROM TEXT_SEARCH t WHERE contains(ABSTRACT,'HTML5高级程序设计 and 历史背景')&0;
阅读(...) 评论()数据库工程师众所周知的一个事实是_解决方案网当前位置:& &&&Oracle全文索引,该如何处理Oracle全文索引,该如何处理本文收集于网络,只用于方便查找方案,感谢源作者,如果侵权请联系删除Oracle全文索引ORACLE 全文索引功能实现学习笔记& 前言: 数据库工程师众所周知的一个事实是,当对数据库里的文本字段进行like检索的时候,任何数据索引都是不起作用的,这样也就导致系统会承担额外的开销和负载压力,对于庞大的数据记录,对其中的文本字段进行关键字匹配,就肯定会存在非常严重的效率障碍和性能障碍。因此,基于文本的全文索引技术也就逐渐兴起。& 全文索引的技术原理并不复杂,对段落性的文本内容进行逐词分解,并针对词出现频率,出现位置进行标记,按照词本身的编码顺序存储为索引文件。这样,在针对关键词进行检索的时候,就不会遍历所有的文本数据记录,而是根据索引文件进行有序查找,这里面一个显见的事实是,通过有序索引查找关键词,对于海量的数据记录而言,也只需要很少次数的指针跳转,(数量为X的索引记录,查询特定记录的指针跳转次数最多为Log2(x)。)即可完成搜索,而无须完整遍历整个数据表或文件集。& 但是全文索引技术的实现却并不简单,针对中文的尤其如此,英文文本中,空格是天然的分词标记,而中文段落却无法通过这样简单的途径分词,因此基于常用语词典和一些语言识别规则的分词技术成为一种非常高的技术门槛,幸好,很多商业公司提供了非常成熟的商业产品,使我等可以坐享其成,快速搭建全文搜索的平台。& ORACLE INTERMEDIA介绍& ORACLE Intermedia是ORACLE公司官方发布的用来管理多媒体数据的数据库管理模块,通过它可以进行有效的视频,音频,图片等文件的统一存储,调用和相关处理;同时其中也包括一个Oracle Intermdedia Text功能模块,能够对多种格式文档进行分词索引处理,也提供了使用自然语法或高级查询方法进行跨文本查询的途径,可以查询word, PDF,RTF等格式的文件和数据。& Oracle Intermedia 的索引效率和查询效率,据一些公开数据上看要远高于Microsoft的Index Server,而且本身具有平台无关特性,另外作为数据库产品,可以很好的和数据库应用进行整合,这一点也是纯粹的文件索引系统所无法实现的。当然,作为通用的数据库产品,Oracle不可能针对全文索引做到最大限度的优化,因此对于高并发大容量的搜索引擎应用,Oracle的方案可能就无法满足,这一点也是必须提前声明的。& 全文索引实现步骤& 步骤1:查看Oracle Intermedia是否正确安装。Oracle Intermdeia是Oracle的一个附带模块,安装过程中选择即可。& 步骤2:设置词法解析器& oracle根据不同语言,有不同的词法解析器,以下说明我们可能用到的三个& basic_lexer,针对英语环境,以空格为分词标记,同时能分辨一些“噪音”单词,如 “if”, “is”等。& chinese_vgram_lexer,专用的汉语分析器,按字为单元分析中文,算法简单,可以一网打尽中文用词,但是效率差强人意。& chinese_lexer,可以识别大部分常用短语和词汇,不会产生大量冗余数据,有很好的实用性,但是语言支持只能为UTF-8编码,不支持zhs16gbk字符集。& 以ctxsys用户登陆系统,执行:& begin ctx_ddl.create_preference('my_lexer','chinese_vgram_lexer');& 这里假设我们的语法解析器命名为my_lexer,这个名称也可以根据实际应用变化。& 步骤3:建立索引字段& 我的测试用例保存在system空间,表名为my_docs,字段名为doc,字段类型为blob,存储标准word doc文件。& 仍旧保持ctxsys帐户登陆,执行如下操作& create index system.myindex on system.my_docs(doc) indextype is ctxsys.context parameters(‘lexer’,’my_lexer’) ;& 步骤4:同步操作(sync)及优化操作& 以system 登陆,同步操作执行& exec ctx_ddl.sync_index('myindex');& 创建同步定时任务代码如下& VARIABLE& BEGIN&   DBMS_JOB.SUBMIT(:jobno,'ctx_ddl.sync_index(''myindex'');',&   SYSDATE, 'SYSDATE + (1/24/4)');&   &   END;& /& 以system登陆,优化索引操作执行& exec ctx_ddl.optimize_index('myindex','FULL');& 创建优化定时任务代码如下& VARIABLE&   BEGIN&   DBMS_JOB.SUBMIT(:jobno,'ctx_ddl.optimize_index(''myindex'',''FULL'');',&   SYSDATE, 'SYSDATE + 1');&   &   END;& /& 步骤5:测试& select id from my_docs where contains(doc,'关键字')&0& 总结:& 该学习笔记内容大部分可以通过搜索引擎找到,并非本人原创内容,本文全部经个人在windows平台下,在oracle 9i下测试完成,留档记录,为日后的项目和产品开发做技术准备。------解决方案--------------------探讨ORACLE 全文索引功能实现学习笔记 前言: 数据库工程师众所周知的一个事实是,当对数据库里的文本字段进行like检索的时候,任何数据索引都是不起作用的,这样也就导致系统会承担额外的开销和负载压力,对于庞大的数据记录,对其中的文本字段进行关键字匹配,就肯定会存在非常严重的效率障碍和性能障碍。因此,基于文本的全文索引技术也就逐渐兴起。 全文索引的技术原理并不复杂,对段落性的文本内容进行……
------解决方案--------------------
当更新文本时,并不会自动更新索引,此时查询不能显示正确的结果,需要同步优化主要是删除索引中多余的信息
------解决方案--------------------DBMS_JOB.SUBMIT(:jobno,'ctx_ddl.optimize_index(''myindex'',''FULL'');',   SYSDATE, 'SYSDATE + 1'); 'SYSDATE + 1' 是多久执行一次
------解决方案--------------------不好意思,工作忙,上网不方便,没空常过来看同步比较重要,如果你修改或新增了一条记录,没有同步索引,则查询结果中不会出现这条记录同步会将新增的term更新到索引中(相关信息可以从DR$myindex$I,DR$myindex$K,DR$myindex$R,DR$myindex$N表中查看,其中myindex为你的索引名称)优化将已经不存在的term从索引中删除。同步比优化重要,一般使用定时任务来进行这两个操作,参考时间间隔为:同步15分钟一次,优化1小时一次详细信息可以查看官方文档Oracle Text Reference/docs/cd/B10501_01/text.920/a96518/preface.htm
------解决方案-------------------- 共&2&页:
上一篇:下一篇:
File: 22:59:52<font face="Arial, Helvetica, sans-serif" color="#
This chapter provides reference information for using the CTX_DDL PL/SQL package to create and manage the preferences, section groups, and stoplists required for Text indexes.
CTX_DDL contains the following stored procedures and functions:
Adds an attribute section to an XML section group. This procedure is useful for defining attributes in XML documents as sections. This allows you to search XML attribute text with the WITHIN operator.
When you use AUTO_SECTION_GROUP, attribute sections are created automatically. Attribute sections created automatically are named in the form tag@attribute.
CTX_DDL.ADD_ATTR_SECTION(
group_name
section_name
varchar2);
group_name
Specify the name of the XML section group. You can add attribute sections only to XML section groups.
section_name
Specify the name of the attribute section. This is the name used for WITHIN queries on the attribute text.
The section name you specify cannot contain the colon (:), comma (,), or dot (.) characters. The section name must also be unique within group_name. Section names are case-insensitive.
Attribute section names can be no more than 64 bytes long.
Specify the name of the attribute in tag@attr form. This parameter is case-sensitive.
Consider an XML file that defines the BOOK tag with a TITLE attribute as follows:
&BOOK TITLE="Tale of Two Cities"&
It was the best of times.
To define the title attribute as an attribute section, create an XML_SECTION_GROUP and define the attribute section as follows:
ctx_ddl_create_section_group('myxmlgroup', 'XML_SECTION_GROUP');
ctx_ddl.add_attr_section('myxmlgroup', 'booktitle', 'BOOK@TITLE');
When you define the TITLE attribute section as such and index the document set, you can query the XML attribute text as follows:
'Cities within booktitle'
Creates a field section and adds the section to an existing section group. This enables field section searching with the
Field sections are delimited by start and end tags. By default, the text within field sections are indexed as a sub-document separate from the rest of the document.
Unlike zone sections, field sections cannot nest or overlap. As such, field sections are best suited for non-repeating, non-overlapping sections such as TITLE and AUTHOR markup in email- or news-type documents.
Because of how field sections are indexed,
queries on field sections are usually faster than WITHIN queries on zone sections.
CTX_DDL.ADD_FIELD_SECTION(
group_name
section_name
boolean default FALSE
group_name
Specify the name of the section group to which section_name is added. You can add up to 64 field sections to a single section group. Within the same group, section zone names and section field names cannot be the same.
section_name
Specify the name of the section to add to the group_name. You use this name to identify the section in queries. Avoid using names that contain non-alphanumeric characters such as _, since these characters must be escaped in queries. Section names are case-insensitive.
Within the same group, zone section names and field section names cannot be the same. The terms Paragraph and Sentence are reserved for special sections.
Section names need not be unique across tags. You can assign the same section name to more than one tag, making details transparent to searches.
Specify the tag which marks the start of a section. For example, if the tag is &H1&, specify H1. The start tag you specify must be unique within a section group.
If group_name is an HTML_SECTION_GROUP, you can create field sections for the META tag's NAME/CONTENT attribute pairs. To do so, specify tag as meta@namevalue where namevalue is the value of the NAME attribute whose CONTENT attribute is to be indexed as a section. Refer to the example.
Oracle knows what the end tags look like from the group_type parameter you specify when you create the section group.
Specify TRUE to make the text visible within rest of document.
By default the visible flag is FALSE. This means that Oracle indexes the text within field sections as a sub-document separate from the rest of the document. However, you can set the visible flag to TRUE if you want text within the field section to be indexed as part of the enclosing document.
Visible and Invisible Field Sections
The following code defines a section group basicgroup of the BASIC_SECTION_GROUP type. It then creates a field section in basicgroup called Author for the &A& tag. It also sets the visible flag to FALSE:
ctx_ddl.create_section_group('basicgroup', 'BASIC_SECTION_GROUP');
ctx_ddl.add_field_section('basicgroup', 'Author', 'A', FALSE);
Because the Author field section is not visible, to find text within the Author section, you must use the
operator as follows:
'(Martin Luther King) WITHIN Author'
A query of Martin Luther King without the WITHIN operator does not return instances of this term in field sections. If you want to query text within field sections without specifying WITHIN, you must set the visible flag to TRUE when you create the section as follows:
ctx_ddl.add_field_section('basicgroup', 'Author', 'A', TRUE);
Creating Sections for &META&Tags
When you use the HTML_SECTION _GROUP, you can create sections for META tags.
Consider an HTML document that has a META tag as follows:
&META NAME="author" CONTENT="ken"&
To create a field section that indexes the CONTENT attribute for the &META NAME="author"& tag:
ctx_ddl.create_section_group('myhtmlgroup', 'HTML_SECTION_GROUP');
ctx_ddl.add_field_section('myhtmlgroup', 'author', 'META@AUTHOR');
After indexing with section group mygroup, you can query the document as follows:
'ken WITHIN author'
Limitations
Nested Sections
Field sections cannot be nested. For example, if you define a field section to start with &TITLE& and define another field section to start with &FOO&, the two sections cannot be nested as follows:
&TITLE& dog &FOO& cat &/FOO& &/TITLE&
To work with nested section define them as zone sections.
Repeated Sections
Repeated field sections are allowed, but WITHIN queries treat them as a single section. The following is an example of repeated field section in a document:
&TITLE& cat &/TITLE&
&TITLE& dog &/TITLE&
The query dog and cat within title returns the document, even though these words occur in different sections.
To have WITHIN queries distinguish repeated sections, define them as zone sections.
Related Topics
operator in .
Use this procedure to add an index to a catalog index preference. You create this preference to create catalog indexes of type CTXCAT.
CTX_DDL.ADD_INDEX(set_name in varchar2,
column_list varchar2,
storage_clause varchar2);
Specify the name of the index set.
column_list
Specify a comma separated list of columns to index.
storage_clause
Specify a storage clause.
Consider a table called AUCTION with the following schema:
create table auction(
item_id number,
title varchar2(100),
category_id number,
price number,
bid_close date);
Assume that queries on the table involve a mandatory text query clause and optional structured conditions on category_id. Results must be sorted based on bid_close.
You can create a catalog index to support the different types of structured queries a user might enter.
To create the indexes, first create the index set preference then add the required indexes to it:
ctx_ddl.create_index_set('auction_iset');
ctx_ddl.add_index('auction_iset','bid_close');
ctx_ddl.add_index('auction_iset','category_id, bid_close');
ctx_ddl.add_index('auction_iset','price, bid_close');
Create the combined catalog index with CREATE INDEX as follows:
create index auction_titlex on AUCTION(title) indextype is CTXCAT parameters
('index set auction_iset');
To query the title column for the word pokemon, you can issue regular and mixed queries as follows:
select * from AUCTION where CATSEARCH(title, 'pokemon',NULL)& 0;
select * from AUCTION where CATSEARCH(title, 'pokemon', 'category_id=99 order by
bid_close desc')& 0;
Adds a special section, either SENTENCE or PARAGRAPH, to a section group. This enables searching within sentences or paragraphs in documents with the
A special section in a document is a section which is not explicitly tagged like zone and field sections. The start and end of special sections are detected when the index is created. Oracle supports two such sections: paragraph and sentence.
The sentence and paragraph boundaries are determined by the lexer.For example, the lexer recognizes sentence and paragraph section boundaries as follows:
The punctuation, whitespace, and newline characters are determined by your lexer settings and can be changed.
If the lexer cannot recognize the boundaries, no sentence or paragraph sections are indexed.
CTX_DDL.ADD_SPECIAL_SECTION(
group_name
IN VARCHAR2,
section_name
IN VARCHAR2);
group_name
Specify the name of the section group.
section_name
Specify SENTENCE or PARAGRAPH.
The following code enables searching within sentences within HTML documents:
ctx_ddl.create_section_group('htmgroup', 'HTML_SECTION_GROUP');
ctx_ddl.add_special_section('htmgroup', 'SENTENCE');
You can also add zone sections to the group to enable zone searching in addition to sentence searching. The following example adds the zone section Headline to the section group htmgroup:
ctx_ddl.create_section_group('htmgroup', 'HTML_SECTION_GROUP');
ctx_ddl.add_special_section('htmgroup', 'SENTENCE');
ctx_ddl.add_zone_section('htmgroup', 'Headline', 'H1');
If you are only interested in sentence or paragraph searching within documents and not interested in defining zone or field sections, you can use the NULL_SECTION_GROUP as follows:
ctx_ddl.create_section_group('nullgroup', 'NULL_SECTION_GROUP');
ctx_ddl.add_special_section('nullgroup', 'SENTENCE');
Related Topics
operator in .
Adds a stopclass to a stoplist. A stopclass is a class of tokens that is not to be indexed.
CTX_DDL.ADD_STOPCLASS(
stoplist_name
stoplist_name
Specify the name of the stoplist.
Specify the stopclass to be added to stoplist_name. Currently, only the NUMBERS class is supported.
The maximum number of stopwords, stopthemes, and stopclasses you can add to a stoplist is 4095.
The following code adds a stopclass of NUMBERS to the stoplist mystop:
ctx_ddl.add_stopclass('mystop', 'NUMBERS');
Related Topics
Adds a stop section to an automatic section group. Adding a stop section causes the automatic section indexing operation to ignore the specified section in XML documents.
Adding a stop section causes no section information to be created in the index. However, the text within a stop section is always searchable.
Adding a stop section is useful when your documents contain many low information tags. Adding stop sections also improves indexing performance with the automatic section group.
The number of stop sections you can add is unlimited.
Stop sections do not have section names and hence are not recorded in the section views.
CTX_DDL.ADD_STOP_SECTION(
section_group IN VARCHAR2,
IN VARCHAR2);
section_group
Specify the name of the automatic section group. If you do not specify an automatic section group, this procedure returns an error.
Specify the tag to ignore during indexing. This parameter is case-sensitive. Defining a stop tag as such also stops the tag's attribute sections, if any.
You can qualify the tag with document type in the form (doctype)tag. For example, if you wanted to make the &fluff& tag a stop section only within the mydoc document type, specify (mydoc)fluff for tag.
Defining Stop Sections
The following code adds a stop section identified by the tag &fluff& to the automatic section group myauto:
ctx_ddl.add_stop_section('myauto', 'fluff');
This code also stops any attribute sections contained within &fluff&. For example, if a document contained:
&fluff type="computer"&
Then the above code also stops the attribute section fluff@type.
Doctype Sensitive Stop Sections
The following code creates a stop section for the tag &fluff& only in documents that have a root element of mydoc:
ctx_ddl.add_stop_section('myauto', '(mydoc)fluff');
Related Topics
Adds a single stoptheme to a stoplist. A stoptheme is a theme that is not to be indexed.
In English, you query on indexed themes using the
CTX_DDL.ADD_STOPTHEME(
stoplist_name
stoplist_name
Specify the name of the stoplist.
Specify the stoptheme to be added to stoplist_name. The system normalizes the stoptheme you enter using the knowledge base. If the normalized theme is more than one theme, the system does not process your stoptheme. For this reason, Oracle recommends that you submit single stopthemes.
The maximum number of stopwords, stopthemes, and stopclasses you can add to a stoplist is 4095.
The following example adds the stoptheme banking to the stoplist mystop:
ctx_ddl.add_stoptheme('mystop', 'banking');
Related Topics
operator in .
Use this procedure to add a single stopword to a stoplist.
To create a list of stopwords, you must call this procedure once for each word.
CTX_DDL.ADD_STOPWORD(
stoplist_name
in varchar2 default NULL
stoplist_name
Specify the name of the stoplist.
Specify the stopword to be added.
Language-specific stopwords must be unique across the other stopwords specific to the language. For example, it is valid to have a German die and an English die in the same stoplist.
The maximum number of stopwords, stopthemes, and stopclasses you can add to a stoplist is 4095.
Specify the language of stopword when the stoplist you specify with stoplist_name is of type MULTI_STOPLIST. You must specify the Globalization Support name or abbreviation of an Oracle-supported language.
To make a stopword active in multiple languages, specify ALL for this parameter. For example, defining ALL stopwords is useful when you have international documents that contain English fragments that need to be stopped in any language.
An ALL stopword is active in all languages. If you use the multi-lexer, the language-specific lexing of the stopword occurs, just as if it had been added multiple times in multiple specific languages.
Otherwise, specify NULL.
Single Language Stoplist
The following example adds the stopwords because, notwithstanding, nonetheless, and therefore to the stoplist mystop:
ctx_ddl.add_stopword('mystop', 'because');
ctx_ddl.add_stopword('mystop', 'notwithstanding');
ctx_ddl.add_stopword('mystop', 'nonetheless');
ctx_ddl.add_stopword('mystop', 'therefore');
Multi-Language Stoplist
The following example adds the German word die to a multi-language stoplist:
ctx_ddl.add_stopword('mystop', 'Die','german');
You can add stopwords after you create the index with ALTER INDEX.
Adding An ALL Stopword
The following adds the word the as an ALL stopword to the multi-language stoplist globallist:
ctx_ddl.add_stopword('globallist','the','ALL');
Related Topics
Add a sub-lexer to a multi-lexer preference. A sub-lexer identifies a language in a multi-lexer (multi-language) preference. Use a multi-lexer preference when you want to index more than one language.
Restrictions
The following restrictions apply to using CTX_DDL.ADD_SUB_LEXER:
The invoking user must be the owner of the multi-lexer or CTXSYS.
The lexer_name parameter must name a preference which is a multi-lexer lexer.
A lexer for default must be defined before the multi-lexer can be used in an index.
The sub-lexer preference owner must be the same as multi-lexer preference owner.
The sub-lexer preference must not be a multi-lexer lexer.
A sub-lexer preference cannot be dropped while it is being used in a multi-lexer preference.
CTX_DDL.ADD_SUB_LEXER records only a reference. The sub-lexer values are copied at create index time to index value storage.
CTX_DDL.ADD_SUB_LEXER(
lexer_name in varchar2,
in varchar2,
sub_lexer in varchar2,
alt_value in varchar2 default null
lexer_name
Specify the name of the multi-lexer preference.
Specify the Globalization Support language name or abbreviation of the sub-lexer. For example, you can specify ENGLISH or EN for English.
The sub-lexer you specify with sub_lexer is used when the language column has a value case-insensitive equal to the Globalization Support name of abbreviation of language.
Specify DEFAULT to assign a default sub-lexer to use when the value of the language column in the base table is null, invalid, or unmapped to a sub-lexer. The DEFAULT lexer is also used to parse stopwords.
If a sub-lexer definition for language already exists, then it is replaced by this call.
Specify the name of the sub-lexer to use for this language.
Optionally specify an alternate value for language.
If you specify DEFAULT for language, you cannot specify an alt_value.
The alt_value is limited to 30 bytes and cannot be an Globalization Support language name, abbreviation, or DEFAULT.
This example shows how to create a multi-language text table and how to set up the multi-lexer to index the table.
Create the multi-language table with a primary key, a text column, and a language column as follows:
create table globaldoc (
doc_id number primary key,
lang varchar2(3),
Assume that the table holds mostly English documents, with the occasional German or Japanese document. To handle the three languages, you must create three sub-lexers, one for English, one for German, and one for Japanese:
ctx_ddl.create_preference('english_lexer','basic_lexer');
ctx_ddl.set_attribute('english_lexer','index_themes','yes');
ctx_ddl.set_attribtue('english_lexer','theme_language','english');
ctx_ddl.create_preference('german_lexer','basic_lexer');
ctx_ddl.set_attribute('german_lexer','composite','german');
ctx_ddl.set_attribute('german_lexer','mixed_case','yes');
ctx_ddl.set_attribute('german_lexer','alternate_spelling','german');
ctx_ddl.create_preference('japanese_lexer','japanese_vgram_lexer');
Create the multi-lexer preference:
ctx_ddl.create_preference('global_lexer', 'multi_lexer');
Since the stored documents are mostly English, make the English lexer the default:
ctx_ddl.add_sub_lexer('global_lexer','default','english_lexer');
Add the German and Japanese lexers in their respective languages. Also assume that the language column is expressed in ISO 639-2, so we add those as alternate values.
ctx_ddl.add_sub_lexer('global_lexer','german','german_lexer','ger');
ctx_ddl.add_sub_lexer('global_lexer','japanese','japanese_lexer','jpn');
Create the index globalx, specifying the multi-lexer preference and the language column in the parameters string as follows:
create index globalx on globaldoc(text) indextype is ctxsys.context
parameters ('lexer global_lexer language column lang');
Creates a zone section and adds the section to an existing section group. This enables zone section searching with the
Zone sections are sections delimited by start and end tags. The &B& and &/B& tags in HTML, for instance, marks a range of words which are to be rendered in boldface.
Zone sections can be nested within one another, can overlap, and can occur more than once in a document.
CTX_DDL.ADD_ZONE_SECTION(
group_name
section_name
group_name
Specify the name of the section group to which section_name is added.
section_name
Specify the name of the section to add to the group_name. You use this name to identify the section in WITHIN queries. Avoid using names that contain non-alphanumeric characters such as _, since most of these characters are special must be escaped in queries. Section names are case-insensitive.
Within the same group, zone section names and field section names cannot be the same. The terms Paragraph and Sentence are reserved for special sections.
Section names need not be unique across tags. You can assign the same section name to more than one tag, making details transparent to searches.
Specify the pattern which marks the start of a section. For example, if &H1& is the HTML tag, specify H1 for tag. The start tag you specify must be unique within a section group.
Oracle knows what the end tags look like from the group_type parameter you specify when you create the section group.
If group_name is an HTML_SECTION_GROUP, you can create zone sections for the META tag's NAME/CONTENT attribute pairs. To do so, specify tag as meta@namevalue where namevalue is the value of the NAME attribute whose CONTENT attributes are to be indexed as a section. Refer to the example.
If group_name is an XML_SECTION_GROUP, you can optionally qualify tag with a document type (root element) in the form (doctype)tag. Doing so makes section_name sensitive to the XML document type declaration. Refer to the example.
Creating HTML Sections
The following code defines a section group called htmgroup of type HTML_SECTION_GROUP. It then creates a zone section in htmgroup called headline identified by the &H1& tag:
ctx_ddl.create_section_group('htmgroup', 'HTML_SECTION_GROUP');
ctx_ddl.add_zone_section('htmgroup', 'heading', 'H1');
After indexing with section group htmgroup, you can query within the heading section by issuing a query as follows:
'Oracle WITHIN heading'
Creating Sections for &META NAME&Tags
You can create zone sections for HTML META tags when you use the HTML_SECTION_GROUP.
Consider an HTML document that has a META tag as follows:
&META NAME="author" CONTENT="ken"&
To create a zone section that indexes all CONTENT attributes for the META tag whose NAME value is author:
ctx_ddl.create_section_group('htmgroup', 'HTML_SECTION_GROUP');
ctx_ddl.add_zone_section('htmgroup', 'author', 'meta@author');
After indexing with section group htmgroup, you can query the document as follows:
'ken WITHIN author'
Creating Document Type Sensitive Sections (XML Documents Only)
You have an XML document set that contains the &book& tag declared for different document types. You want to create a distinct book section for each document type.
Assume that mydocname is declared as an XML document type (root element) as follows:
&!DOCTYPE mydocname ... [...
Within mydocname, the element &book& is declared. For this tag, you can create a section named mybooksec that is sensitive to the tag's document type as follows:
ctx_ddl.create_section_group('myxmlgroup', 'XML_SECTION_GROUP');
ctx_ddl.add_zone_section('myxmlgroup', 'mybooksec', '(mydocname)book');
Repeated Sections
Zone sections can repeat. Each occurrence is treated as a separate section. For example, if &H1& denotes a heading section, they can repeat in the same documents as follows:
&H1& The Brown Fox &/H1&
&H1& The Gray Wolf &/H1&
Assuming that these zone sections are named Heading, the query Brown WITHIN Heading returns this document. However, a query of (Brown and Gray) WITHIN Heading does not.
Overlapping Sections
Zone sections can overlap each other. For example, if &B& and &I& denote two different zone sections, they can overlap in document as follows:
plain &B& bold &I& bold and italic &/B& only italic &/I&
Nested Sections
Zone sections can nest, including themselves as follows:
&TD& &TABLE&&TD&nested cell&/TD&&/TABLE&&/TD&
Using the WITHIN operator, you can write queries to search for text in sections within sections. For example, assume the BOOK1, BOOK2, and AUTHOR zone sections occur as follows in documents doc1 and doc2:
&book1& &author&Scott Tiger&/author& This is a cool book to read.&book1&
&book2& &author&Scott Tiger&/author& This is a great book to read.&book2&
Consider the nested query:
'Scott within author within book1'
This query returns only doc1.
Related Topics
operator in .
Creates an index set for CTXCAT index types. You name this index set in the parameter clause of CREATE INDEX when you create a CTXCAT index.
CTX_DDL.CREATE_INDEX_SET(set_name in
varchar2);
Specify the name of the index set. You name this index set in the parameter clause of CREATE INDEX when you create a CTXCAT index.
CREATE_POLICY
Creates a policy to use with the ORA:CONTAINS function. ORA:CONTAINS is a function you use within an XPATH query expression with existsNode.
CTX_DDL.CREATE_POLICY(
policy_name
IN VARCHAR2 DEFAULT NULL,
IN VARCHAR2 DEFAULT NULL,
section_group
IN VARCHAR2 DEFAULT NULL,
IN VARCHAR2 DEFAULT NULL,
IN VARCHAR2 DEFAULT NULL,
IN VARCHAR2 DEFAULT NULL);
policy_name
Specify the name for the new policy.
Specify the filter preference to use.
In this release, this parameter is not used.
section_group
Specify the section group to use. You can specify only NULL_SECTION_GROUP. Only special (sentence and paragraph) section are supported.
Specify the lexer preference to use. Your INDEX_THEMES attribute must be disabled.
specify the stoplist to use.
Specify the wordlist to use.
Create mylex lexer preference named mylex.
ctx_ddl.create_preference('mylex', 'BASIC_LEXER');
ctx_ddl.set_attribute('mylex', 'printjoins', '_-');
ctx_ddl.set_attribute ( 'mylex', 'index_themes', 'NO');
ctx_ddl.set_attribute ( 'mylex', 'index_text', 'YES');
Create a stoplist preference named mystop.
ctx_ddl.create_stoplist('mystop', 'BASIC_STOPLIST');
ctx_ddl.add_stopword('mystop', 'because');
ctx_ddl.add_stopword('mystop', 'nonetheless');
ctx_ddl.add_stopword('mystop', 'therefore');
Create a wordlist preference named 'mywordlist'.
ctx_ddl.create_preference('mywordlist', 'BASIC_WORDLIST');
ctx_ddl.set_attribute('mywordlist','FUZZY_MATCH','ENGLISH');
ctx_ddl.set_attribute('mywordlist','FUZZY_SCORE','0');
ctx_ddl.set_attribute('mywordlist','FUZZY_NUMRESULTS','5000');
ctx_ddl.set_attribute('mywordlist','SUBSTRING_INDEX','TRUE');
ctx_ddl.set_attribute('mywordlist','STEMMER','ENGLISH');
exec ctx_ddl.create_policy('my_policy', NULL, NULL, 'mylex', 'mystop',
'mywordlist');
exec ctx_ddl.create_policy(policy_name =& 'my_policy',
lexer =& 'mylex',
stoplist =& 'mystop',
wordlist =& 'mywordlist');
Then you can issue the following ExistsNode() query with your own defined policy:
select id from xmltab
where existsNode(doc, '/book/chapter[ ora:contains(summary,"dog or cat", "my_
policy") &0 ]' );
You can update your policy by doing:
exec ctx_ddl.update_policy(policy_name =& 'my_policy', lexer =& 'my_new_lex');
You can drop your policy by doing:
exec ctx_ddl.drop_policy(policy_name =& 'my_policy');
Creates a preference in the Text data dictionary. You specify preferences in the parameter string of
CTX_DDL.CREATE_PREFERENCE(preference_name
in varchar2,
object_name
in varchar2);
preference_name
Specify the name of the preference to be created.
object_name
Specify the name of the preference type.
For a complete list of preference types and their associated attributes, see .
Creating Text-only Index
The following example creates a lexer preference that specifies a text-only index. It does so by creating a BASIC_LEXER preference called my_lexer with CTX_DDL.CREATE_PREFERENCE. It then calls CTX_DDL. twice, first specifying Y for the INDEX_TEXT attribute, then specifying N for the INDEX_THEMES attribute.
ctx_ddl.create_preference('my_lexer', 'BASIC_LEXER');
ctx_ddl.set_attribute('my_lexer', 'INDEX_TEXT', 'YES');
ctx_ddl.set_attribute('my_lexer', 'INDEX_THEMES', 'NO');
Specifying File Data Storage
The following example creates a data storage preference called mypref that tells the system that the files to be indexed are stored in the operating system. The example then uses CTX_DDL. to set the PATH attribute of to the directory /docs.
ctx_ddl.create_preference('mypref', 'FILE_DATASTORE');
ctx_ddl.set_attribute('mypref', 'PATH', '/docs');
For more information about data storage, see
Creating Master/Detail Relationship
You can use CTX_DDL. to create a preference with DETAIL_DATASTORE. You use CTX_DDL. to set the attributes for this preference. The following example shows how this is done:
ctx_ddl.create_preference('my_detail_pref', 'DETAIL_DATASTORE');
ctx_ddl.set_attribute('my_detail_pref', 'binary', 'true');
ctx_ddl.set_attribute('my_detail_pref', 'detail_table', 'my_detail');
ctx_ddl.set_attribute('my_detail_pref', 'detail_key', 'article_id');
ctx_ddl.set_attribute('my_detail_pref', 'detail_lineno', 'seq');
ctx_ddl.set_attribute('my_detail_pref', 'detail_text', 'text');
For more information about master/detail, see
Specifying Storage Attributes
The following examples specify that the index tables are to be created in the foo tablespace with an initial extent of 1K:
ctx_ddl.create_preference('mystore', 'BASIC_STORAGE');
ctx_ddl.set_attribute('mystore', 'I_TABLE_CLAUSE',
'tablespace foo storage (initial 1K)');
ctx_ddl.set_attribute('mystore', 'K_TABLE_CLAUSE',
'tablespace foo storage (initial 1K)');
ctx_ddl.set_attribute('mystore', 'R_TABLE_CLAUSE',
'tablespace foo storage (initial 1K)');
ctx_ddl.set_attribute('mystore', 'N_TABLE_CLAUSE',
'tablespace foo storage (initial 1K)');
ctx_ddl.set_attribute('mystore', 'I_INDEX_CLAUSE',
'tablespace foo storage (initial 1K)');
Creating Preferences with No Attributes
When you create preferences with types that have no attributes, you need only create the preference, as in the following example which sets the filter to the NULL_FILTER:
ctx_ddl.create_preference('my_null_filter', 'NULL_FILTER');
Related Topics
Creates a section group for defining sections in a text column.
When you create a section group, you can add to it zone, field, or special sections with , , or .
When you index, you name the section group in the parameter string of
After indexing, you can query within your defined sections with the
CTX_DDL.CREATE_SECTION_GROUP(
group_name
group_type
group_name
Specify the section group name to create as [user.]section_group_name. This parameter must be unique within an owner.
group_type
Specify section group type. The group_type parameter can be one of:
The following command creates a section group called htmgroup with the HTML group type.
ctx_ddl.create_section_group('htmgroup', 'HTML_SECTION_GROUP');
The following command creates a section group called auto with the AUTO_SECTION_GROUP group type to be used to automatically index tags in XML documents.
ctx_ddl.create_section_group('auto', 'AUTO_SECTION_GROUP');
Related Topics
operator in .
Use this procedure to create a new, empty stoplist. Stoplists can contain words or themes that are not to be indexed.
You can also create multi-language stoplists to hold language-specific stopwords. A multi-language stoplist is useful when you index a table that contains documents in different languages, such as English, German, and Japanese. When you do so, you text table must contain a language column.
You can add either stopwords, stopclasses, or stopthemes to a stoplist using , , or .
You can specify a stoplist in the parameter string of
to override the default stoplist .
CTX_DDL.CREATE_STOPLIST(
stoplist_name IN VARCHAR2,
stoplist_type IN VARCHAR2 DEFAULT 'BASIC_STOPLIST');
stoplist_name
Specify the name of the stoplist to be created.
stoplist_type
Specify BASIC_STOPLIST to create a stoplist for a single language. This is the default.
Specify MULTI_STOPLIST to create a stoplist with language-specific stopwords.
At indexing time, the language column of each document is examined, and only the stopwords for that language are eliminated. At query time, the session language setting determines the active stopwords, like it determines the active lexer when using the multi-lexer.
When indexing a multi-language table with a multi-language stoplist, your table must have a language column.
Single Language Stoplist
The following code creates a stoplist called mystop:
ctx_ddl.create_stoplist('mystop', 'BASIC_STOPLIST');
Multi-Language Stoplist
The following code creates a multi-language stoplist called multistop and then adds tow language-specific stopwords:
ctx_ddl.create_stoplist('multistop', 'MULTI_STOPLIST');
ctx_ddl.add_stopword('mystop', 'Die','german');
ctx_ddl.add_stopword('mystop', 'Or','english');
Related Topics
Drops an index set.
CTX_DDL.DROP_INDEX_SET(set_name in varchar2);
Specify the name of the index set to drop.
DROP_POLICY
Drops a policy create with CREATE_POLICY.
CTX_DDL.DROP_POLICY(policy_name IN VARCHAR2);
policy_name
Specify the name of the policy to drop.
The DROP_PREFERENCE procedure deletes the specified preference from the Text data dictionary. Dropping a preference does not affect indexes that have already been created using that preference.
CTX_DDL.DROP_PREFERENCE(preference_name IN VARCHAR2);
preference_name
Specify the name of the preference to be dropped.
The following code drops the preference my_lexer.
ctx_ddl.drop_preference('my_lexer');
Related Topics
The DROP_SECTION_GROUP procedure deletes the specified section group, as well as all the sections in the group, from the Text data dictionary.
CTX_DDL.DROP_SECTION_GROUP(group_name IN VARCHAR2);
group_name
Specify the name of the section group to delete.
The following code drops the section group htmgroup and all its sections:
ctx_ddl.drop_section_group('htmgroup');
Related Topics
Drops a stoplist from the Text data dictionary. When you drop a stoplist, you must re-create or rebuild the index for the change to take effect.
CTX_DDL.DROP_STOPLIST(stoplist_name in varchar2);
stoplist_name
Specify the name of the stoplist.
The following code drops the stoplist mystop:
ctx_ddl.drop_stoplist('mystop');
Related Topics
Use this procedure to optimize the index. You optimize your index after you synchronize it. Optimizing the index removes old data and minimizes index fragmentation. Optimizing the index can improve query response time.
You can optimize in fast, full, or token mode. In token mode, you specify a specific token to be optimized. You can use token mode to optimize index tokens that are frequently searched, without spending time on optimizing tokens that are rarely referenced. An optimized token can improve query response time for that token.
Optimizing an index can result in better response time only if you insert, delete, or update documents in your base table after your initial indexing operation.
Using this procedure to optimize your index is recommended over using the ALTER INDEX statement.
Limitations
The CTX_DDL.OPTIMIZE_INDEX procedure optimizes at most 16,000 document ids. To continue optimizing more document ids, re-run this procedure.
CTX_DDL.OPTIMIZE_INDEX(
NUMBER DEFAULT NULL,
IN VARCHAR2 DEFAULT NULL,
part_name IN VARCHAR2 DEFAULT NULL,
parallel_degree IN VARCHAR2);
Specify the name of the index. If you do not specify an index name, Oracle chooses a single index to optimize.
Specify optimization level as a string. You can specify one of the following methods for optimization:
Specify maximum optimization time, in minutes, for FULL optimize.
When you specify the symbol CTX_DDL.MAXTIME_UNLIMITED (or pass in NULL), the entire index is optimized. This is the default.
Specify the token to be optimized.
Specify the name of the index partition to optimize.
parallel_degree
Specify the parallel degree as a number for parallel optimization. The actual parallel degree depends on your resources.
The following two examples optimize the index for fast optimization.
ctx_ddl.optimize_index('myidx','FAST');
ctx_ddl.optimize_index('myidx',CTX_DDL.OPTLEVEL_FAST);
The following example optimizes the index token Oracle:
ctx_ddl.optimize_index('myidx','token', TOKEN=&'Oracle');
Related Topics
Removes the index with the specified column list from a CTXCAT index set preference.
This procedure does not remove a CTXCAT sub-index from the existing index. To do so, you must drop your index and re-index with the modified index set preference.
CTX_DDL.REMOVE_INDEX(
set_name in varchar2,
column_list in varchar2
language in varchar2 default NULL
Specify the name of the index set
column_list
Specify the name of the column list to remove.
The REMOVE_SECTION procedure removes the specified section from the specified section group. You can specify the section by name or by id. You can view section id with the CTX_USER_SECTIONS view.
Use the following syntax to remove a section by section name:
CTX_DDL.REMOVE_SECTION(
group_name
section_name
group_name
Specify the name of the section group from which to delete section_name.
section_name
Specify the name of the section to delete from group_name.
Use the following syntax to remove a section by section id:
CTX_DDL.REMOVE_SECTION(
group_name
section_id
group_name
Specify the name of the section group from which to delete section_id.
section_id
Specify the section id of the section to delete from group_name.
The following code drops a section called Title from the htmgroup:
ctx_ddl.remove_section('htmgroup', 'Title');
Related Topics
Removes a stopclass from a stoplist.
CTX_DDL.REMOVE_STOPCLASS(
stoplist_name
stoplist_name
Specify the name of the stoplist.
Specify the name of the stopclass to be removed.
The following code removes the stopclass NUMBERS from the stoplist mystop.
ctx_ddl.remove_stopclass('mystop', 'NUMBERS');
Related Topics
Removes a stoptheme from a stoplist.
CTX_DDL.REMOVE_STOPTHEME(
stoplist_name
stoplist_name
Specify the name of the stoplist.
Specify the stoptheme to be removed from stoplist_name.
The following code removes the stoptheme banking from the stoplist mystop:
ctx_ddl.remove_stoptheme('mystop', 'banking');
Related Topics
Removes a stopword from a stoplist. To have the removal of a stopword be reflected in the index, you must rebuild your index.
CTX_DDL.REMOVE_STOPWORD(
stoplist_name
varchar2 default NULL
stoplist_name
Specify the name of the stoplist.
Specify the stopword to be removed from stoplist_name.
Specify the language of stopword to remove when the stoplist you specify with stoplist_name is of type MULTI_STOPLIST. You must specify the Globalization Support name or abbreviation of an Oracle-supported language. You can also remove ALL stopwords.
The following code removes a stopword because from the stoplist mystop:
ctx_ddl.remove_stopword('mystop','because');
Related Topics
Sets a preference attribute. You use this procedure after you have created a preference with CTX_DDL..
ctx_ddl.set_attribute(preference_name in varchar2,
attribute_name
in varchar2,
attribute_value in varchar2);
preference_name
Specify the name of the preference.
attribute_name
Specify the name of the attribute.
attribute_value
Specify the attribute value. You can specify boolean values as TRUE or FALSE, T or F, YES or NO, Y or N, ON or OFF, or 1 or 0.
Specifying File Data Storage
The following example creates a data storage preference called filepref that tells the system that the files to be indexed are stored in the operating system. The example then uses CTX_DDL. to set the PATH attribute to the directory /docs.
ctx_ddl.create_preference('filepref', 'FILE_DATASTORE');
ctx_ddl.set_attribute('filepref', 'PATH', '/docs');
For more information about data storage, see
For more examples of using SET_ATTRIBUTE, see .
Synchronizes the index to process inserts, updates, and deletes to the base table.
ctx_ddl.sync_index(
VARCHAR2 DEFAULT NULL
memory IN VARCHAR2 DEFAULT NULL,
part_name IN VARCHAR2 DEFAULT NULL
parallel_degree IN NUMBER DEFAULT 1);
Specify the name of the index.
Specify the runtime memory to use for synchronization. This value overrides the DEFAULT_INDEX_MEMORY system parameter.
The memory parameter specifies the amount of memory Oracle uses for the synchronization operation before flushing the index to disk. Specifying a large amount of memory:
improves indexing performance because there is less I/O
improves query performance and maintenance because there is less fragmentation
Specifying smaller amounts of memory increases disk I/O and index fragmentation, but might be useful when runtime memory is scarce.
Specify the name of the index partition to synchronize.
parallel_degree
Specify the degree to run parallel synchronize. A number greater than 1 turns on parallel synchronize. The actual degree of parallelism might be smaller depending on your resources.
The following example synchronizes the index myindex with 2 megabytes of memory:
ctx_ddl.sync_index('myindex', '2M');
The following example synchronizes the part1 index partition with 2 megabytes of memory:
ctx_ddl.sync_index('myindex', '2M', 'part1');
Related Topics
Removes a set attribute from a preference.
CTX_DDL.UNSET_ATTRIBUTE(preference_name varchar2,
attribute_name
varchar2);
preference_name
Specify the name of the preference.
attribute_name
Specify the name of the attribute.
Enabling/Disabling Alternate Spelling
The following example shows how you can enable alternate spelling for German and disable alternate spelling with CTX_DDL.UNSET_ATTRIBUTE:
ctx_ddl.create_preference('GERMAN_LEX', 'BASIC_LEXER');
ctx_ddl.set_attribute('GERMAN_LEX', 'ALTERNATE_SPELLING', 'GERMAN');
To disable alternate spelling, use the CTX_DDL.UNSET_ATTRIBUTE procedure as follows:
ctx_ddl.unset_attribute('GERMAN_LEX', 'ALTERNATE_SPELLING');
Related Topics
UPDATE_POLICY
Updates a policy created with CREATE_POLICY. Replaces the preferences of the policy. Null arguments are not replaced.
CTX_DDL.UPDATE_POLICY(
policy_name
IN VARCHAR2 DEFAULT NULL,
IN VARCHAR2 DEFAULT NULL,
section_group
IN VARCHAR2 DEFAULT NULL,
IN VARCHAR2 DEFAULT NULL,
IN VARCHAR2 DEFAULT NULL,
IN VARCHAR2 DEFAULT NULL);
policy_name
Specify the name of the policy to update.
Specify the filter preference to use.
section_group
Specify the section group to use.
Specify the lexer preference to use.
specify the stoplist to use.
Specify the wordlist to use.
All Rights Reserved.

我要回帖

更多关于 oralce教程 的文章

 

随机推荐