Chapter 1 Quickstart
This section describes how content is crawled and indexed by Krugle Enterprise. Once your data is crawled and indexed, it can then be searched by your organization's users using the client features in Krugle Enterprise. Krugle "Projects" define the collections of content that are searchable with Krugle Enterprise. Krugle Projects can be defined by an Administrator using the Krugle Administration Console.
Projects can refer to a single file or any/all files in one or more data repositories. A data repository can be a file system, a source code management system, an issue tracking system, a database or any other information system that can be accessed via a Krugle SCM Connector.
Factors to consider when setting up Krugle Projects
The first step in setting up your Krugle Projects is deciding which content you want to define in Krugle Enterprise. Some of the factors that should be considered in defining individual Krugle Projects include:
Importance of defining Krugle Projects
The following scenarios illustrate how Projects in Krugle can be defined to ensure good search and reporting results:
Projects are most commonly defined as collections of standalone content files, code or data records, or libraries of content. The files in each group are usually interrelated, and are maintained and accessed as a set. Most importantly, these file groupings fit the logical context and expectations of the users who will be using Krugle.
Once you've decided how to organize your code into projects, collect the information needed to define each Project. At a minimum, the following information is required for each Project:
Note
If you lack the time or information needed to group content collections into individual projects, the simplest way to make your information searchable in Krugle is to associate a single Krugle Project with all files in each file system, repository or version control system.
The downside of having only one Project per content repository is that less information (in the form of project metadata) is available to create effective queries and refine code search results.
Note
You can add or remove Projects at any time after the initial configuration. This will allow for progressive refinement of Projects managed by Krugle Enterprise.
Setting up SCMI Connectors
Krugle Enterprise Data Access Mechanisms
For a given data repository, Krugle Enterprise accesses files using a Krugle SCM Connector. The SCM Connector approach is not limited to specific SCM systems, as it can provide access to file systems, issue tracking systems, and other non-SCM sources of data. SCM Connectors run outside of the Krugle Administration Console, and are installed and configured separately. The only information required by Krugle Enterprise for a Data Repository is the information needed to access the SCM Connector.
Krugle Basic Note - To download and install SCMI connectors, please contact Krugle Basic Support.
Defining Projects in Krugle Enterprise
This section explains how Krugle Projects are defined, either interactively through the Krugle Enterprise UI or via a mass import file.
Interactive Entry of Project Information
The easiest way to define Krugle Projects is to manually enter their specifications, one Project at a time. This is the interactive approach, as opposed to the mass import approach, which is described below.
Creating a New Krugle Project
To define a Krugle Project interactively, first sign in to the Krugle Enterprise Console and navigate to the Projects section:
Specify Project Metadata
Add a Data Set to your Krugle Project
A Krugle Project consists of one or more Data Sets. A Krugle Data Set is defined as a reference to (i) a single Data Repository and (ii) a Data Set Location within that Data Repository.
After specifying the Project Name and optional metadata, you must define the Data Sets for the Project. The first step in adding a Data Set is to specify a Data Repository.
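The relationship between Projects, Data Sets and Data Repositories can be pictured as a small data model. The following Python sketch is illustrative only: the class names, the `is_complete` helper and the sample values are hypothetical and are not part of any Krugle API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DataRepository:
    """An information system reached through a Krugle SCM Connector."""
    name: str


@dataclass(frozen=True)
class DataSet:
    """A reference to one Data Repository plus a location inside it."""
    repository: DataRepository
    location: str


@dataclass
class Project:
    """A named, searchable collection made up of one or more Data Sets."""
    name: str
    data_sets: List[DataSet] = field(default_factory=list)

    def is_complete(self) -> bool:
        # A Project needs a name and at least one Data Set.
        return bool(self.name.strip()) and len(self.data_sets) >= 1


# Hypothetical sample values, for illustration only.
repo = DataRepository(name="main-svn")
project = Project(name="Billing Service")
before = project.is_complete()   # no Data Set has been added yet
project.data_sets.append(DataSet(repository=repo, location="/billing/trunk"))
after = project.is_complete()
```

Several Data Sets can point at the same `DataRepository` object, which mirrors how one Data Repository definition can be reused across Data Sets.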
Creating a Data Set with a New Data Repository
When creating a new Data Repository, set the fields as described by the Data Repository Fields table.
Once the Data Repository has been created, follow the steps described below, by first selecting this Data Repository from the dropdown list.
Creating a Data Set with an Existing Data Repository
If the Data Repository that you want to use for your Data Set has already been defined in Krugle:
OPTIONAL Project Information Mass Import
The Mass Import feature allows an Administrator to upload the definitions for multiple Projects with a single action. Before using Mass Import, verify that Krugle Enterprise is operating properly and familiarize yourself with interactive Project definition (previous section). When importing a large number of Projects, divide them into smaller groups organized by repository. Start by importing several Projects in a single file, and increase the number of Projects per mass import file as you progress.
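One way to prepare such groups is to split the Project rows by repository before writing the import files. The sketch below assumes each row is held as a Python dictionary keyed by mass import column name. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterator, List

Row = Dict[str, str]


def batches_by_repository(rows: List[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield lists of rows, each drawn from one repository and at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    grouped: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        grouped[row["Data Repository name"]].append(row)
    for repo_rows in grouped.values():
        for start in range(0, len(repo_rows), batch_size):
            yield repo_rows[start:start + batch_size]


# Hypothetical rows, for illustration only.
rows = [
    {"Project name": "alpha", "Data Repository name": "svn-main"},
    {"Project name": "beta", "Data Repository name": "svn-main"},
    {"Project name": "gamma", "Data Repository name": "svn-main"},
    {"Project name": "delta", "Data Repository name": "git-tools"},
]
batches = list(batches_by_repository(rows, batch_size=2))
```

Each resulting batch can then be written to its own mass import file, starting with a small `batch_size` and raising it once the first imports have been verified.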
To use Mass Import:
Sample mass import file
We have provided a sample mass import file to use for testing, and as a template for your own projects. To use this Sample Mass Import file:
Note
If a Mass Import file contains an exact duplicate of a Project that is already defined in Krugle, the duplicated Project in the Mass Import file is ignored during the Mass Import process. If a Mass Import file contains a Project that is already defined in Krugle, and the file's information differs from what already exists, the information from the Mass Import file is used to update the Project.
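The ignore-or-update behavior described in this note can be modeled as follows. This is a simplified sketch and not Krugle's implementation. It assumes that Projects are matched on their case-insensitive name and that imported values overwrite the stored ones field by field, and neither assumption is confirmed here. All names in the code are hypothetical.

```python
from typing import Dict, Tuple

Fields = Dict[str, str]


def apply_mass_import(
    existing: Dict[str, Fields], incoming: Dict[str, Fields]
) -> Tuple[Dict[str, Fields], Dict[str, str]]:
    """Return the resulting Projects and what happened to each imported entry."""
    # Assumption: Projects are matched on their case-insensitive name.
    result = {name.lower(): dict(fields) for name, fields in existing.items()}
    outcome: Dict[str, str] = {}
    for name, fields in incoming.items():
        key = name.lower()
        if key not in result:
            result[key] = dict(fields)
            outcome[name] = "created"
        elif result[key] == fields:
            outcome[name] = "ignored"   # exact duplicate of an existing Project
        else:
            result[key].update(fields)  # assumption: imported values overwrite
            outcome[name] = "updated"
    return result, outcome


# Hypothetical Projects, for illustration only.
existing = {"Alpha": {"Rank": "NORMAL"}, "Beta": {"Rank": "LOW"}}
incoming = {
    "alpha": {"Rank": "NORMAL"},
    "Beta": {"Rank": "HIGH"},
    "Gamma": {"Rank": "NORMAL"},
}
projects, outcome = apply_mass_import(existing, incoming)
```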
Mass Import Project Information
Information about each field can be found in the Project Fields, Data Repository Fields, and Data Set Fields tables. These are all of the required fields:
If the Data Repository Name has not already been created by the Krugle Enterprise Administrator, then the following additional fields are required in the first row of the Mass Import file that uses the (new) Data Repository Name (subsequent rows can leave these fields blank):
Project Fields
This table is a complete list of all fields that can be defined for a Project. These fields are in addition to the one or more Data Sets that specify what content will be indexed as part of the Project.
Field | Mass Import Column Name | Description |
---|---|---|
Project name | Project name | A name that uniquely identifies a collection of content in Krugle Enterprise. This Project name can be used as a query filter by the end user and will be used in Project based reports and analysis. Whenever possible, use a descriptive name that will be familiar to users. A unique project name is required for each Project. Note: Krugle Project names are NOT case sensitive. |
Rank priority | Rank | (Optional, defaults to NORMAL) When users search on code they will see files from "boosted" Projects near the top of the list. Administrators can set the Project Rank priority to increase (or decrease) the visibility of selected Projects. This allows companies to ensure that code libraries, reuse components and other valued content collections have appropriate visibility in search results. Setting the Rank to IGNORE will prevent the project from showing up in search results. Possible values are IGNORE, LOW, NORMAL, and HIGH. |
Crawl frequency | Crawl frequency | (Optional, defaults to DAILY) How often the SCM Connector for each Data Set is queried to get updates. Possible values are TWICE_AN_HOUR, HOURLY, TWICE_A_DAY, DAILY, TWICE_A_WEEK, WEEKLY, TWICE_A_MONTH, MONTHLY, and ONCE. |
First Update | First Update | (Optional, defaults to the Project's creation time) Sets the earliest time for the first update after the initial sync. This is useful for delaying the first update until after a very large project has finished its initial crawl, as otherwise updates can "stack up" as they wait for the initial crawl to complete. |
Description | Description | (Optional, defaults to empty) This is a human readable description of the content in this Project. A one to two paragraph summary of the Project's capabilities, technologies, related Projects, Project dependencies, etc. will help future users of the Project better understand and use the information contained in this Project. The use of unique terms in the description will improve search matching for those unique terms. |
Homepage URL | Homepage URL | (Optional, defaults to empty) The homepage or project page for this Project. Use this optional URL reference to provide users with one-click access to non-code related information, the Project wiki, etc. |
Documentation URL | Documentation URL | (Optional, defaults to empty) This reference URL can be used to direct users to specifications, reference documentation and similar Project documents. |
Knowledgebase URL | Knowledge Base URL | (Optional, defaults to empty) This reference URL can provide users with a shortcut to an appropriate knowledge base from the Project description page. The knowledge base can reference information such as development notes, hints, tips or discussions. |
Bug database URL | Bug database URL | (Optional, defaults to empty) This reference URL can provide users with a shortcut to the Project bug database from the Project description page. |
Owner | Owner | (Optional, defaults to no owner) A person who can be contacted with questions or issues about this Project. Usually, it is recommended that you enter the email address of the person responsible for the Project. |
License | License | (Optional, defaults to no license) A code license type to be associated with all files in the Project. Typically this is used for open source code that your organization is using, so that users will know what restrictions are placed on the code that they find. |
Access Control | Access control | (Optional, defaults to --Unrestricted--) This setting specifies the groups (typically LDAP-based) that have access to this Project. In order to see or access files in a particular Project, a user must belong to one or more groups listed in this setting. The default setting (--Unrestricted--) will allow all users to access the Project. |
<not supported during manual definition> | Key name | (Optional, defaults to a hash code based on the Project's required fields) The unique identifier for the project. This can be up to 128 characters - lower case letters, numbers, underscore and hyphen are allowed. |
<not supported during manual definition> | Disable | (Optional, defaults to false) Disabling a project prevents it from being updated, and removes it from search results. To disable a project, first create it, then select it in the Projects Summary list, and click the Disable button at the top of the list. |
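Before uploading a mass import file, it can help to check Project rows against the constraints listed in the table above. The following sketch does so for the Project name, Rank, Crawl frequency and Key name columns. It assumes that a blank optional cell means "use the default" and that values must match the documented spelling exactly. The function name and sample rows are hypothetical.

```python
import re
from typing import Dict, List

RANKS = {"IGNORE", "LOW", "NORMAL", "HIGH"}
CRAWL_FREQUENCIES = {
    "TWICE_AN_HOUR", "HOURLY", "TWICE_A_DAY", "DAILY", "TWICE_A_WEEK",
    "WEEKLY", "TWICE_A_MONTH", "MONTHLY", "ONCE",
}
# Up to 128 characters: lower case letters, numbers, underscore and hyphen.
KEY_NAME = re.compile(r"[a-z0-9_-]{1,128}")


def check_project_row(row: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one Project row (empty if none)."""
    problems = []
    if not row.get("Project name", "").strip():
        problems.append("Project name is required")
    rank = row.get("Rank", "")
    if rank and rank not in RANKS:  # assumption: blank means default (NORMAL)
        problems.append(f"unknown Rank: {rank}")
    frequency = row.get("Crawl frequency", "")
    if frequency and frequency not in CRAWL_FREQUENCIES:
        problems.append(f"unknown Crawl frequency: {frequency}")
    key = row.get("Key name", "")
    if key and not KEY_NAME.fullmatch(key):
        problems.append(f"invalid Key name: {key}")
    return problems


# Hypothetical rows, for illustration only.
good = {"Project name": "Billing Service", "Rank": "HIGH",
        "Crawl frequency": "", "Key name": "billing-service_01"}
bad = {"Project name": " ", "Rank": "URGENT",
       "Crawl frequency": "YEARLY", "Key name": "Billing Service"}
good_problems = check_project_row(good)
bad_problems = check_project_row(bad)
```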
Data Repository Fields
This table is a complete list of all fields that can be defined for a Data Repository.
Field | Mass Import Column Name | Description |
---|---|---|
Data Repository Host Location | Host | The network address for the server that hosts the SCM Connector (for example, localhost or 192.169.25.231). |
Data Repository name | Data Repository name | Unique name to identify the data repository. |
Login | Login | (Optional) If the SCM Connector is configured to require authentication, this is the user name required when logging into the SCM Connector. |
Password | Password | (Optional) If the SCM Connector is configured to require authentication, this is the password required when logging into the SCM Connector. |
Path | Root path | The path portion of the URL used to access the SCM Connector. By default this will be "/repository", unless the SCM Connector has a custom configuration. |
Port | Port | The port used to access the SCM Connector. By default this will be 80 for HTTP, 443 for HTTPS, and 22 for SSH, unless the SCM Connector has a custom configuration. |
Connection type | Connection type | The protocol used to talk to the SCM Connector. Options are HTTP, HTTPS, and SSH. |
The "Data Repository type" field is only used by older versions of Krugle (V5 and earlier) during mass import, and if it exists in a mass import file being processed by Krugle V6 or later, it must be empty or set to "SCMI".
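To show how the Data Repository fields and their defaults fit together, the sketch below combines Connection type, Host, Port and Root path into a single address string. The URL layout is an assumption made for illustration, since this guide does not state how Krugle Enterprise composes these values internally. The function name and the `scm.example.com` host are hypothetical.

```python
from typing import Optional

# Defaults taken from the Data Repository Fields table.
DEFAULT_PORTS = {"HTTP": 80, "HTTPS": 443, "SSH": 22}
DEFAULT_ROOT_PATH = "/repository"


def connector_address(
    host: str,
    connection_type: str = "HTTP",
    port: Optional[int] = None,
    root_path: Optional[str] = None,
) -> str:
    """Build an illustrative address string for an SCM Connector."""
    ctype = connection_type.upper()
    if ctype not in DEFAULT_PORTS:
        raise ValueError(f"unsupported connection type: {connection_type}")
    effective_port = port if port is not None else DEFAULT_PORTS[ctype]
    path = root_path or DEFAULT_ROOT_PATH
    if not path.startswith("/"):
        path = "/" + path
    # Assumption: scheme://host:port/path is used here only for readability.
    return f"{ctype.lower()}://{host}:{effective_port}{path}"


default_address = connector_address("localhost")
secure_address = connector_address("localhost", "HTTPS")
custom_address = connector_address("scm.example.com", "ssh", 2222, "custom")
```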
Data Set Fields
This table is a complete list of all fields that can be defined for a Data Set. A Data Set is based on a Data Repository, with additional fields to specify which project or sub-set of data available via the Data Repository is part of the Data Set.
Field | Mass Import Column Name | Description |
---|---|---|
Location | SCMI Data Set Location | This field defines the location of the Data Set's content within the Data Repository. For some Data Repositories it might not be required - see SCM Connectors for per-SCM Connector details. |
Location Alias | Alias for SCMI Data Set Location | This is currently not supported by Krugle V5 or V6. |
Parameter | Params for SCMI Data Set Location | This field defines additional parameters that are used when querying the Data Repository for the Data Set's content. For some Data Repositories it might not be required - see SCM Connectors for per-SCM Connector details. |
Alias | Data Set name | The name of this Data Set (unique per-project). Up to 64 characters - lower case letters, numbers, underscore and hyphen are allowed. This is used as part of the URL path to Data Set files. |
The "Data Set Location" field is only used by older versions of Krugle (V5 and earlier) during mass import, and if it exists in a mass import file being processed by Krugle V6 or later, it will be ignored.
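As an illustration of how the column names in the three tables above might come together, the following sketch writes two Project rows that share one new Data Repository. It assumes the mass import file is a comma-separated file with a header row, which this guide does not confirm, so use the sample mass import file provided with Krugle Enterprise as the authoritative template. The selection of columns, the function name and all sample values are hypothetical.

```python
import csv
import io
import re
from typing import Dict, Iterable, TextIO

# Column names as listed in the "Mass Import Column Name" table columns.
COLUMNS = [
    "Project name", "Data Repository name", "Host", "Port",
    "Connection type", "Root path", "SCMI Data Set Location", "Data Set name",
]
# Up to 64 characters: lower case letters, numbers, underscore and hyphen.
DATA_SET_NAME = re.compile(r"[a-z0-9_-]{1,64}")


def write_mass_import(rows: Iterable[Dict[str, str]], out: TextIO) -> None:
    """Write rows as CSV (assumed format), leaving unspecified cells blank."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    for row in rows:
        name = row.get("Data Set name", "")
        if name and not DATA_SET_NAME.fullmatch(name):
            raise ValueError(f"invalid Data Set name: {name}")
        writer.writerow(row)


# The first row that uses a new Data Repository carries its connection fields;
# the second row reuses the repository and leaves those fields blank.
rows = [
    {"Project name": "Billing Service", "Data Repository name": "svn-main",
     "Host": "localhost", "Port": "80", "Connection type": "HTTP",
     "Root path": "/repository", "SCMI Data Set Location": "/billing/trunk",
     "Data Set name": "billing-trunk"},
    {"Project name": "Invoice Tools", "Data Repository name": "svn-main",
     "SCMI Data Set Location": "/invoice/trunk",
     "Data Set name": "invoice-trunk"},
]
buffer = io.StringIO()
write_mass_import(rows, buffer)
parsed = list(csv.DictReader(io.StringIO(buffer.getvalue())))
```

Writing to an in-memory buffer first makes it easy to inspect the result before saving it to a file.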