Best way to get a bounding polygon of all features in a file .gdb


I have an arcpy script that dynamically creates a file geodatabase and its feature classes from uploaded spatial data, currently point and line features and possibly polygons in the future. I am interested in the best way to create a polygon around the entire contents of the gdb.

Ideally, it would be perfect if there were a way to run minimum bounding geometry on a set of feature classes, or an entire geodatabase; maybe that exists somewhere and I am just not seeing it.

I have had success calling that tool on each feature class and then taking the union of all the resulting polygons for the geodatabase; however, as the datasets get large, that doesn't seem like a very efficient way of handling it.

Another way I'm looking at right now is getting a list of all the points and running a bounding-geometry algorithm on them, or, alternatively, just getting the min/max X/Y of each feature class and running the algorithm against a list of those.
It just seems like there is a tool or something that I'm not aware of that does this and would be a better way of solving the problem.


You could loop through all the feature classes in the gdb and use the minimum bounding geometry gp tool to create convex hull polys in memory and then run a final minimum bounding geometry process on all these polys. This is untested, but something like this may work:

```python
import os
import fnmatch  # needed by iter_ws but missing from the original import line
import arcpy

def iter_ws(workspace, dataType="Any", ftype="ANY", wildcard="*"):
    """Iterates through a workspace using arcpy.da.Walk.

    Valid data types can be found at:
    http://resources.arcgis.com/en/help/main/10.1/index.html#//018w00000023000000

    Required:
        workspace -- workspace to check for features
    Optional:
        dataType -- type of data to search for
        ftype    -- feature and raster data types can be further filtered by this parameter
        wildcard -- wildcard for feature names
    """
    # find top level features and rasters
    for dirpath, dirnames, filenames in arcpy.da.Walk(workspace, datatype=dataType, type=ftype):
        for name in filenames:
            feature = os.path.join(dirpath, name)
            if wildcard != '*':
                if fnmatch.fnmatch(feature, os.path.join(dirpath, wildcard)):
                    yield feature
            else:
                yield feature

def gdb_bb(gdb, output_polygon):
    polys = []
    for fc in iter_ws(gdb):
        tmp = r'in_memory\{0}_poly'.format(os.path.basename(fc))
        arcpy.MinimumBoundingGeometry_management(fc, tmp, "CONVEX_HULL", "ALL")
        polys.append(tmp)
    # merge the per-feature-class hulls
    merged = r'in_memory\merged'
    arcpy.Merge_management(polys, merged)
    # run convex hull again to create the final output
    arcpy.MinimumBoundingGeometry_management(merged, output_polygon, "CONVEX_HULL", "ALL")
    return output_polygon

if __name__ == '__main__':
    gdb = r'path\to\your.gdb'
    final = r'path\to\your.gdb\convex_hull'
    gdb_bb(gdb, final)
```

For a full geodatabase, a quick solution is to create a bounding polygon based on the extents of each feature class. Of course, I assume that all your feature classes are in the same coordinate system.

You can loop over the feature classes of the geodatabase, request the extent of each, and keep the running min/max values in X and Y. Something like below:

```python
# start from +/-infinity so negative coordinates are handled correctly
minX = minY = float('inf')
maxX = maxY = float('-inf')
for featureClass in featureClasses:
    extent = arcpy.Describe(featureClass).extent
    minX = min(minX, extent.XMin)
    maxX = max(maxX, extent.XMax)
    minY = min(minY, extent.YMin)
    maxY = max(maxY, extent.YMax)
```

Auto generate bounding polygon for a given image

I'm trying to create a list of points that describes the bounding polygon of the given image as closely as possible. The polygons will be used to calculate collisions between two graphics, so they have to be accurate and small (in terms of points per polygon) at the same time.

My base images (e.g. PNG) are always monochrome pictures with a however shaped black "spot" in the middle. I have tried to achieve this by using the autotrace cli tool which converts my image to svg. I am currently parsing the paths within the svg to get the coordinates of the bounding polygon.

This solution works but is quite inefficient. The points generated by the autotrace command shown below are too accurate.

For the example shape attached to this post, this generates 800 points, where, theoretically, 4 would be sufficient (one at each "corner").

As I am not interpolating curves at runtime, the resulting svg has to consist of lines only.

How do I get a bounding polygon that is accurate enough but has a small footprint?
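One common way to cut a traced outline down to its essential vertices (independent of autotrace) is a polyline-simplification pass such as the Ramer-Douglas-Peucker algorithm, which discards every point that deviates less than a tolerance from the simplified shape. A minimal pure-Python sketch (the sample `square` data is invented for illustration):

```python
import math

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker: simplify a polyline, keeping only points that
    deviate more than epsilon from the chord joining the endpoints."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy) or 1e-12
    # find the interior point farthest from the end-to-end chord
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        px, py = points[i]
        d = abs(dy * (px - x1) - dx * (py - y1)) / norm
        if d > dmax:
            index, dmax = i, d
    if dmax > epsilon:
        # split at the farthest point and recurse on both halves
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]

# A slightly noisy square outline collapses to its corners
square = [(0, 0), (0.5, 0.01), (1, 0), (1, 0.5),
          (1, 1), (0.5, 1.01), (0, 1), (0, 0.5)]
print(rdp(square, 0.05))
# → [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0.5)]
```

The four corners survive (plus the open end of the ring, since the outline here is not closed); tightening or loosening `epsilon` trades point count against accuracy, which is exactly the accurate-but-small trade-off described above.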


Parameters

The point or polygon feature class for which hot spot analysis will be performed.

The output feature class to receive the z-score, p-value, and Gi_Bin results.

The numeric field (number of incidents, crime rates, test scores, and so on) to be evaluated.

The aggregation method to use to create weighted features for analysis from incident point data.

  • Count incidents within fishnet grid — A fishnet polygon mesh will overlay the incident point data and the number of incidents within each polygon cell will be counted. If no bounding polygon is provided in the Bounding Polygons Defining Where Incidents Are Possible parameter, only cells with at least one incident will be used in the analysis; otherwise, all cells within the bounding polygons will be analyzed.
  • Count incidents within hexagon grid — A hexagon polygon mesh will overlay the incident point data and the number of incidents within each polygon cell will be counted. If no bounding polygon is provided in the Bounding Polygons Defining Where Incidents Are Possible parameter, only cells with at least one incident will be used in the analysis; otherwise, all cells within the bounding polygons will be analyzed.
  • Count incidents within aggregation polygons — You provide aggregation polygons to overlay the incident point data in the Polygons For Aggregating Incidents Into Counts parameter. The incidents within each polygon are counted.
  • Snap nearby incidents to create weighted points — Nearby incidents will be aggregated together to create a single weighted point. The weight for each point is the number of aggregated incidents at that location.

A polygon feature class defining where the incident Input Features could possibly occur.

The polygons to use to aggregate the incident Input Features in order to get an incident count for each polygon feature.

The Density Surface parameter is disabled; it remains as a tool parameter only to support backwards compatibility. The Kernel Density tool can be used if you would like a density surface visualization of your weighted points.

The size of the grid cells used to aggregate the Input Features. When aggregating into a hexagon grid, this distance is used as the height to construct the hexagon polygons.

The spatial extent of the analysis neighborhood. This value determines which features are analyzed together in order to assess local clustering.

The point or polygon feature class for which hot spot analysis will be performed.

The output feature class to receive the z-score, p-value, and Gi_Bin results.

The numeric field (number of incidents, crime rates, test scores, and so on) to be evaluated.

The aggregation method to use to create weighted features for analysis from incident point data.

  • COUNT_INCIDENTS_WITHIN_FISHNET_POLYGONS — A fishnet polygon mesh will overlay the incident point data and the number of incidents within each polygon cell will be counted. If no bounding polygon is provided in the Bounding_Polygons_Defining_Where_Incidents_Are_Possible parameter, only cells with at least one incident will be used in the analysis; otherwise, all cells within the bounding polygons will be analyzed.
  • COUNT_INCIDENTS_WITHIN_HEXAGON_POLYGONS — A hexagon polygon mesh will overlay the incident point data and the number of incidents within each polygon cell will be counted. If no bounding polygon is provided in the Bounding_Polygons_Defining_Where_Incidents_Are_Possible parameter, only cells with at least one incident will be used in the analysis; otherwise, all cells within the bounding polygons will be analyzed.
  • COUNT_INCIDENTS_WITHIN_AGGREGATION_POLYGONS — You provide aggregation polygons to overlay the incident point data in the Polygons_For_Aggregating_Incidents_Into_Counts parameter. The incidents within each polygon are counted.
  • SNAP_NEARBY_INCIDENTS_TO_CREATE_WEIGHTED_POINTS — Nearby incidents will be aggregated together to create a single weighted point. The weight for each point is the number of aggregated incidents at that location.

A polygon feature class defining where the incident Input_Features could possibly occur.

The polygons to use to aggregate the incident Input_Features in order to get an incident count for each polygon feature.

The Density_Surface parameter is disabled; it remains as a tool parameter only to support backwards compatibility. The Kernel Density tool can be used if you would like a density surface visualization of your weighted points.

The size of the grid cells used to aggregate the Input_Features. When aggregating into a hexagon grid, this distance is used as the height to construct the hexagon polygons.

The spatial extent of the analysis neighborhood. This value determines which features are analyzed together in order to assess local clustering.

Code sample

The following Python window script demonstrates how to use the OptimizedHotSpotAnalysis tool.

The following stand-alone Python script demonstrates how to use the OptimizedHotSpotAnalysis tool.
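The original ESRI samples are not reproduced in this excerpt. As a hedged sketch only (the workspace, dataset, and output names below are invented for illustration, and running it requires ArcGIS with arcpy), a minimal call might look like:

```python
# Sketch of an Optimized Hot Spot Analysis call (requires ArcGIS / arcpy).
# All workspace and dataset names below are illustrative, not from the original sample.
import arcpy

arcpy.env.workspace = r"C:\data\incidents.gdb"

# Point incidents aggregated into a fishnet grid; z-score, p-value, and
# Gi_Bin results are written to the output feature class.
arcpy.OptimizedHotSpotAnalysis_stats(
    "calls",          # Input_Features: the incident points
    "callsHotSpots",  # Output_Features
    "",               # Analysis_Field: empty, so incidents are counted rather than weighted
    "COUNT_INCIDENTS_WITHIN_FISHNET_POLYGONS")  # aggregation method
```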


Lab 3. qPublic to ArcMap

Lab 3 to upload to ELC by next Thursday at midnight. Browse the properties referenced on the AFM property search page (https://americanforestmanagement.com/real-estate/property-search) for examples of ‘good’ maps – maps that you think convey the idea that this is the property for you! Most of the property pages I’ve looked at have a Related Documents section to the right of the page with links to aerial photo, topo, or location maps. I want you to find one map that contains something that you want to do in your maps. Copy into a Word document the URL, a screen shot of the map, and what it is that you want to be able to do that you see in the map – be specific. Upload this document to ELC. Do this after you work through the rest of the lab.

Lab 3… You will be working in this project next Wednesday, so make sure you save everything to your working directory and that you copy this working directory up to your network space when you finish today.

You have been contacted by a client group who needs maps of four parcels scattered across three counties.

  • Two parcels in Walton county under the ownership “Dallyville LLC” (34.79 acre and 125 acre parcels) (figure 1)
  • One 424.4 acre property in Washington county under the ownership of “MACS Washington Properties LLC” (figure 2)
  • One 331.74 acre property in Elbert county under the ownership of “Broad River Properties LLC” (figure 3)

Figure 1. Dallyville, LLC. land holdings in Walton county, Georgia.

Figure 2. MACS Washington Properties, LLC. land holding in Washington county, Georgia.

Figure 3. Broad River Properties, LLC. holding in Elbert county, Georgia.

Processing Overview

  1. Create your working directory
  2. Download KMLS from qPublic
  3. Save the qPublic tax record for each property
  4. Run the ArcMap command KML TO LAYER to convert each KML file (from step 2) to an intermediate feature class
  5. Create one new file geodatabase called “PropertiesLLC” in which you will save all of the GIS layers you create
  6. Import the feature classes (from step 4) into the file geodatabase (from step 5)
  7. Run the ArcMap command MERGE (DATA MANAGEMENT) to combine the polygon features from the different layers (created in step 6) into a single layer called “PropertyBoundaries”
  8. Use the ArcMap PROJECT command to mathematically transform the Lat/Long boundaries to a projected coordinate system

At the end of this lab, you will have an ArcMap project with one GIS layer. This polygon feature class layer will contain all of your client’s property boundaries. You will continue on this project next week when you will determine the acreage of the different cover types and generate summaries for the properties.

You will see text outlined in orange throughout this lab. This is my way of indicating that you are about to have to do something. I have also included a written narrative with plenty of screenshots.

A final word on this lab… At the start, you download three of KML files and then import them into their own file geodatabase – you have three of these in your working directory. You then import these feature classes into a file geodatabase that you create – one file geodatabase with three feature classes. Next, you merge these three feature classes into one. When working on projects where there are multiple, non-adjacent land holdings, the GIS operator has to decide if it is best to work with individual feature classes or merge them all together (like we do in this lab). My advice is to merge the layers and process the one file instead of having to process three files – fewer chances for error.

Set up working directory

Create folder on the C: drive called “lab3properties”

qPublic.net (do these steps for each of the three properties)

Counties in Georgia, Colorado, Connecticut, Kentucky, Louisiana, and South Carolina provide access to online GIS parcel information through qPublic.net. This service enables users to search parcel-level records and, in many cases, download GIS-ready parcel boundaries. Please note, however, that some counties provide this service free of charge while others require a daily/monthly/yearly per-county subscription (the daily subscription fee is $10 per county). For more information, visit qpublic.net.

NAVIGATING THE QPUBLIC SITE (in PICTURES)

  1. Google is a good place to start to find your county’s qPublic site
  2. Accept the statement in blue…
  3. If the site is NOT behind a pay-wall, enter your search criteria and hit the search button
    If the site IS behind a pay-wall, you must register and pay before entering the site. qPublic offers a reasonable $10/day subscription level.
  4. Use the REPORT tab (along the top of the page) to display tax-related information for the property query
  5. Use the MAP tab to show the web-GIS
  6. The SEARCH tab takes you to monthly lists of Real Estate sales
  7. The SALES SEARCH tab takes you to a page where you can enter a custom query for Real Estate sales.

DOWNLOADing GOOGLE EARTH KML FROM QPUBLIC SITE

Locate the parcel of interest either by query (step 3 above) or by manually searching the GIS map (step 5 above). If you locate your parcel by manually searching the map, you need to zoom to your parcel and then use the select feature tool (tool #2 below) to select it.

While in the web GIS window (step 5 above), you will be presented with a list of selected parcels in the right-most part of the screen. If you want to view the property information the county has on file for this ownership, use the report link at the top of the screen. It never hurts to save this information in your working directory. You can save this page with a right-click > save as… You can also hit the print button and print to a PDF by changing the printer to PDF writer (you will most likely see something worded differently, but it will say PDF).

Once you are certain you have selected the correct parcel, use the Google Earth KML button (button #7) to save the KML boundary to your local machine.

Another word on file management.

If your web browser provides you an opportunity to specify where you want to save your file, then make sure you save it in your working directory. If it does not, it will probably download the file automatically to your Downloads folder. If and when this happens,

  • Download the file
  • Navigate to the Downloads folder using the File Explorer
  • Copy the KML you downloaded to your working directory
    • Recall that our working directory in labs 1 and 2 was c:ocnatforest

    CONVERT KML TO FILE GEODATABASE FEATURE CLASS

    At this point, you should have a file with a KML extension in your working directory – use File Explorer to verify. The KML file is the native Google Earth format. You can not analyze a KML in ArcMap so you must first use the KML TO LAYER tool in ArcMap to convert from a KML to a file geodatabase feature class.

    The KML TO LAYER tool is straight forward:

    • Input …: Fully specify the path to your KML (use the file/open widget to do this)
    • Output Location: Specify the path to your working directory (do not include the new file name)
    • Output Data Name: Name of the file geodatabase that will be created in this process. The file geodatabase must not already exist.

    This is what my working directory looks like in File Explorer before the KML TO LAYER process.

    This is what my working directory looks like in File Explorer after the KML TO LAYER process.

    In the above example, the folder “PropertyBndyKML.gdb” is a file geodatabase (a folder within my working directory). Do not save anything inside this *.gdb folder.

    You will see a new layer in your ArcMap project after this process (screen shot below). Change the symbology to a dark outline and a hollow fill.

    1. This indicates that your feature class is a polygon layer.
    2. I am displaying the TOC by source. My feature class is saved in a file geodatabase called PropertyBndyKML.

    Create file geodatabase and import layers

    ESRI’s file geodatabase (FGDB) is a more recent data standard in which the user may store vector (points, lines, polygons), raster, and tabular data. Run ArcMap’s CREATE FILE GDB tool to create your new FGDB.

    Import layers into your new file geodatabase

    One of many ways to import GIS data into a FGDB is to use ArcMap’s FEATURE CLASS TO GEODATABASE command.

    This import procedure creates three new feature classes in your PropertiesLLC FGDB. However, whenever possible, you want all of the polygons you plan to analyze stored in a single feature class.

    Merge feature classes into one feature class

    This procedure will put all of the polygons (I also call these features) from the three layers you created by running the Feature Class to Geodatabase command above into one feature class. Call this new feature class “PropertyBoundaries”.

    Project the PropertyBoundaries layer from Lat/Long coordinate system to UTM Zone 17, NAD1983 coordinate system

    Your data should be cast in a projected coordinate system any time you plan to measure or report area, perimeter, length, or any of the other possible geometries. The data from qPublic is cast in a geographic coordinate system (Lat/Long). Use ArcMap’s PROJECT (DATA MANAGEMENT) command to mathematically transform your data to UTM coordinate space. In the Project dialog, you need to drill down and select (do not type) the Output Coordinate System NAD_1983_UTM_Zone_17N (hit the 4th button to browse the coordinate systems and drill down to Projected Coordinate Systems > UTM > NAD 1983 > NAD 1983 UTM Zone 17N).

    Prepare attribute table and calculate geometry

    Add two fields to your PropertyBoundariesUTM layer: one called “mytype” (text) and another called “gisacres” (float). Your attribute table should look like this:

    If you followed this approach to the letter, you have a field called “FolderPath” that contains the property’s county. This is an attribute carried over from the original KMLs. The other two fields of interest are “mytype” and “gisacres”.

    If you are satisfied with your PropertyBoundariesUTM layer, remove all other layers in the TOC.

    Verify that the layer is saved in your C: workspace (right-click on layer > Properties > Source Tab).

    Save your project. Close your project.

    Copy your working directory up to your network drive.

    In practice, this entire process should take no more than 5 minutes.

    We will continue work on this project during Monday’s digitizing lecture.


    Project description

    Gdspy is a Python module for creation and manipulation of GDSII stream files. Key features for the creation of complex CAD layouts are included:

    • Boolean operations on polygons (AND, OR, NOT, XOR) based on a clipping algorithm
    • Polygon offset (inward and outward rescaling of polygons)
    • Efficient point-in-polygon solutions for large array sets
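The polygon operations listed above can be sketched as follows (a hedged example, not from the official documentation; it assumes gdspy 1.x is installed and uses its `boolean`, `offset`, and `inside` functions):

```python
import gdspy

# Two overlapping rectangles
r1 = gdspy.Rectangle((0, 0), (2, 1))
r2 = gdspy.Rectangle((1, 0), (3, 1))

# Boolean OR (union) of the two polygons
union = gdspy.boolean(r1, r2, "or")

# Outward offset (rescaling) of the union by 0.1 units
grown = gdspy.offset(union, 0.1)

# Point-in-polygon test: one boolean per query point
hits = gdspy.inside([(1.5, 0.5), (5.0, 5.0)], union)

# Write the results to a GDSII stream file
lib = gdspy.GdsLibrary()
cell = lib.new_cell("DEMO")
cell.add([union, grown])
lib.write_gds("demo.gds")
```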

    Gdspy also includes a simple layout viewer.

    Typical applications of Gdspy are in the fields of electronic chip design, planar lightwave circuit design, and mechanical engineering.

    In trying to improve the performance of Gdspy for large layouts, we ended up concluding that the best way to reach our goal was to rewrite the critical parts of the library as a C extension. It turns out that, besides the obvious functions, method calling has a big impact on performance due to the overhead it introduces. The best solution was to re-design the whole project as a C++ library with a thin Python wrapper: thus was born [Gdstk, the GDSII Tool Kit](https://github.com/heitzmann/gdstk).

    Therefore, version 1.6 will be the last major release of Gdspy, with development focused only on bug fixes. Users are encouraged to move from Gdspy to Gdstk: although their API is not 100% compatible, the new module should be familiar enough to allow a quick transition.

    • [Python](https://www.python.org/) (tested with versions 2.7, 3.6, 3.7, and 3.8)
    • [Numpy](http://numpy.scipy.org/)
    • C compiler (needed only if built from source)
    • Tkinter (optional: needed for the LayoutViewer GUI)
    • [Sphinx](https://www.sphinx-doc.org/) (optional: to build the documentation)

    Option 1: install via pip:

    `python -m pip install --user gdspy`

    Option 2: download the source from [github](https://github.com/heitzmann/gdspy) and build/install with:

    `python setup.py install`

    The preferred option is to install pre-compiled binaries from [here](https://github.com/heitzmann/gdspy/releases).

    Installation via pip and building from source as above are also possible, but an appropriate [build environment](https://wiki.python.org/moin/WindowsCompilers) is required for compilation of the C extension modules.

    The complete documentation is available [here](http://gdspy.readthedocs.io/).

    The source files can be found in the docs directory.

    ### Version 1.6.6 (Jun 09, 2021) * Fix error in Path.smooth not finding _hobby function. * Allow precision specification in SVG output.

    ### Version 1.6.5 (Jun 08, 2021) * Support GDSII files with 0-padding at the end. * Allow fixing and modifying GDSII file timestamps. * Thanks Troy Tamas and Joaquin Matres for the fixes

    ### Version 1.6.4 (Apr 23, 2021) * Fix missing module import (thanks Troy Tamas for the fix).

    ### Version 1.6.3 (Dec 28, 2020) * Fix bounding box edge case (thanks Troy Tamas for the fix).

    ### Version 1.6.2 (Dec 18, 2020) * More efficient bounding box calculation (thanks to Troy Tamas for the contribution). * Fix Label creation bug.

    ### Version 1.6.1 (Oct 22, 2020) * Fix SVG output when Label contains special characters.

    ### Version 1.6 (Aug 12, 2020) * Added support for element properties. * Added transformation support to Cell.copy . * Layer/datatype filtering in get_polygons for Cell , CellReference and CellArray . * Layer/datatype filtering in LayoutViewer . * Removed global cache _bounding_boxes . Only cells cache their bounding boxes. * Bug fixes (thanks Daniel Hwang for the contributions). * Bug fix in Cell.copy where the whole dependency tree would be copied on a deep copy creation.

    ### Version 1.5.2 (Feb 01, 2020) * Added support for importing GDSII files containing BOX elements. * Bug fix in GdsLibrary.extract (thanks collineps for finding the problem).

    ### Version 1.5 (Dec 20, 2019) * New Cell.write_svg function to export an SVG image of the cell. * New GdsLibrary.new_cell function to quickly create and add cells to a library. * GdsLibrary.add can update references when a cell is overwritten. * Added GdsLibrary.remove to allow cells to be properly removed from libraries. * Added GdsLibrary.rename_cell to rename cells in libraries. * Added GdsLibrary.replace_references to easily replace referenced cells in libraries. * GdsLibrary.add can add dependencies recursively. * Iterating over GdsLibrary objects yields all its cells. * Iterating over Cell objects yield all its polygons, paths, labels and references. * Breaking change to *.to_gds functions in order to improve write efficiency (this should not be a problem for most users, since gdspy.write_gds and Cell.write_gds remain the same). * Breaking change: renamed GdsLibrary.cell_dict to GdsLibrary.cells . * Deprecated: gdspy.current_library , gdspy.write_gds , gdspy.fast_boolen , GdsLibrary.extract . * Bug fixes and better tests for FlexPath and RobustPath .

    ### Version 1.4.3 (Nov 11, 2019) * Bug fix for FlexPath and RobustPath references.

    ### Version 1.4.2 (Oct 01, 2019) * Bug fix in FlexPath .

    ### Version 1.4.1 (Sep 20, 2019) * Bug fixes (thanks to DerekK88 and Sequencer for the patches).

    ### Version 1.4 (May 18, 2019) * Revised [documentation](http://gdspy.readthedocs.io/). * New FlexPath and RobustPath classes: more efficient path generation when using the original GDSII path specification. * New Curve class: SVG-like polygon creation. * Added PolygonSet.mirror (thanks to Daan Waardenburg for the contribution). * Added Path.bezier to create paths based on Bézier curves. * Added Path.smooth to create paths based on smooth interpolating curves. * Added get_gds_units to get units used in a GDSII file without loading. * Added get_binary_cells to load only the binary GDSII representation of cell from a file. * Added argument tolerance to Round , Path.arc , Path.turn , and Path.parametric to automatically control the number of points in the final polygons. * Added argument binary_cells to GDSII writing functions to support get_binary_cells . * Added argument rename_template to GdsLibrary.read_gds for flexible cell renaming (thanks to @yoshi74ls181 for the contribution). * Changed return value of slice to avoid creating empty PolygonSet . * Added argument timestamp to GDSII writing functions. * Improved Round to support creating ellipses. * Added support for unlimited number of points per polygon. * Added support for BGNEXTN and ENDEXTN when reading a GDSII file. * Polygon creation warnings are now controlled by poly_warnings . * Incorrect anchor in Label now raises an error, instead of emitting a warning. * Added correct support for radius in PolygonSet.fillet on a per-vertex basis. * Speed improvements in GDSII file generation (thanks to @fbeutel for the contribution) and geometry creation. * Font rendering example using [matplotlib](https://matplotlib.org/) (thanks Hernan Pastoriza for the contribution). * Expanded test suite.

    ### Version 1.3.2 (Mar 14, 2019) * Small fix for building on Mac OS X Mojave.

    ### Version 1.3.1 (Jun 29, 2018) * PolygonSet becomes the base class for all polygons, in particular Polygon and Rectangle . * Added Cell.remove_polygons and Cell.remove_labels functions to allow filtering a cell contents based, for example, on each element’s layer. * Added PolygonSet.scale utility method. * Added PolygonSet.get_bounding_box utility method. * Added argument timestamp to Cell.to_gds , GdsLibrary.write_gds and GdsWriter . * Added unit and precision arguments to GdsLibrary initialization and removed from its write_gds method. * Changed the meaning of argument unit in GdsLibrary.read_gds . * Improved slice to avoid errors when slicing in multiple positions at once. * Improved PolygonSet.fracture to reduce number of function calls. * Removed incorrect absolute flags for magnification and rotation in CellReference and CellArray . * Minor bug fixes. * Documentation fixes. * Removed deprecated classes and functions.

    ### Version 1.2.1 (Dec 5, 2017) * GdsLibrary can be created directly from a GDSII file * Added return value to GdsLibrary.read_gds * Fixed return value of GdsLibrary.add

    ### Version 1.2 (Oct 21, 2017) * Added new gdsii_hash function. * Added precision parameter to _chop , Polygon.fracture , Polygon.fillet , PolygonSet.fracture , PolygonSet.fillet , and slice . * Included labels in flatten operations (added get_labels to Cell , CellReference , and CellArray ). * Fixed bug in the bounding box cache of reference copies. * Fixed bug in _chop that affected Polygon.fracture , PolygonSet.fracture , and slice . * Other minor bug fixes.

    ### Version 1.1.2 (Mar 19, 2017) * Update clipper library to 6.4.2 to fix bugs introduced in the last update. * License change to Boost Software License v1.0.

    ### Version 1.1.1 (Jan 27, 2017) * Patch to fix installation issue (missing README file in zip).

    ### Version 1.1 (Jan 20, 2017) * Introduction of GdsLibrary to allow user to work with multiple library simultaneously. * Deprecated GdsImport in favor of GdsLibrary . * Renamed gds_print to write_gds and GdsPrint to GdsWriter . * Development changed to Python 3 (Python 2 supported via [python-future](http://python-future.org/)). * Added photonics example. * Added test suite. * Clipper library updated to last version. * Fixed inside function sometimes reversing the order of the output. * Fixed rounding error in fast_boolean . * Fixed argument deep_copy being inverted in Cell.copy . * Bug fixes introduced by numpy (thanks to Adam McCaughan for the contribution).

    ### Version 1.0 (Sep 11, 2016) * Changed to “new style” classes (thanks to Adam McCaughan for the contribution). * Added a per-point radius specification for Polygon.fillet (thanks to Adam McCaughan for the contribution). * Added inside fucntion to perform point-in-polygon tests (thanks to @okianus for the contribution). * Moved from distutils to setuptools for better Windows support.

    ### Version 0.9 (Jul 17, 2016) * Added option to join polygons before applying an offset . * Added a translate method to geometric entities (thanks John Bell for the commit). * Bug fixes.

    ### Version 0.8.1 (May 6, 2016) * New fast_boolean function based on the [Clipper](http://www.angusj.com/delphi/clipper.php) library with much better performance than the old boolean . * Changed offset signature to also use the [Clipper](http://www.angusj.com/delphi/clipper.php) library (this change breaks compatibility with previous versions). * Bug fix for error when importing some labels from GDSII files.

    ### Version 0.7.1 (June 26, 2015) * Rebased to GitHub. * Changed source structure and documentation.

    ### Version 0.7 (June 12, 2015) * New feature: offset function. * New GdsPrint class for incremental GDSII creation (thanks to Jack Sankey for the contribution).

    ### Version 0.6 (March 31, 2014) * Default number of points for Round , Path.arc , and Path.turn changed to resolution of 0.01 drawing units. * Path.parametric accepts callable final_distance and final_width for non-linear tapering. * Added argument ends to PolyPath . * Added (limited) support for PATHTYPE in GdsImport . * A warning is issued when a Path curve has width larger than twice its radius (self-intersecting polygon). * Added a random offset to the patterns in LayoutViewer . * LayoutViewer shows cell labels for referenced cells. * get_polygons returns (referenced) cell name if depth < 1 and by_spec is True. * Bug fix in get_bounding_box when empty cells are referenced. * Bug fixes in GdsImport and many speed improvements in bounding box calculations (thanks to Gene Hilton for the patch).

    ### Version 0.5 (October 30, 2013) - NOT COMPATIBLE WITH PREVIOUS VERSIONS

    * Major LayoutViewer improvements (not backwards compatible).
    * The layer argument has been repositioned in the argument list in all functions (not backwards compatible).
    * Renamed argument by_layer to by_spec (not backwards compatible).
    * Error is raised for polygons with more vertices than possible in the GDSII format.
    * Removed the global state variable for default datatype.
    * Added get_datatypes to Cell.
    * Added argument single_datatype to Cell.flatten.
    * Removed gds_image and dropped the optional PIL dependency.

    ### Version 0.4.1 (June 5, 2013)

    * Added argument axis_offset to Path.segment allowing creation of asymmetric tapers.
    * Added missing argument x_reflection to Label.
    * Created a global state variable to override the default datatype.
    * Bug fix in CellArray.get_bounding_box (thanks to George McLean for the fix).

    ### Version 0.4 (October 25, 2012)

    * Cell.get_bounding_box returns None for empty cells.
    * Added a cache for bounding boxes for faster computation, especially for references.
    * Added support for text elements with the Label class.
    * Improved the emission of warnings.
    * Added a tolerance parameter to boolean.
    * Added better print descriptions to classes.
    * Bug fixes in boolean involving results with multiple holes.

    ### Version 0.3.1 (May 24, 2012)

    * Bug fix in the fracture method for PolygonSet.

    ### Version 0.3a (May 03, 2012)

    * Bug fix in the fracture method for Polygon and PolygonSet.

    ### Version 0.3 (April 25, 2012)

    * Support for Python 3.2 and 2.7.
    * Further improvements to the boolean function via caching.
    * Added methods get_bounding_box and get_layers to Cell.
    * Added method top_level to GdsImport.
    * Added support for importing GDSII path elements.
    * Added an argument to control the verbosity of the import function.
    * Layer -1 (referenced cells) sent to the bottom of the layer list by default in LayoutViewer.
    * The text and background of the layer list in LayoutViewer now reflect the colors of the outlines and canvas background.
    * Changed default background color in LayoutViewer.
    * Thanks to Gene Hilton for the contributions!

    ### Version 0.2.9 (December 14, 2011)

    * Attribute Cell.cell_list changed to Cell.cell_dict.
    * Changed the signature of the operation in boolean.
    * Order of cells passed to LayoutViewer is now respected in the GUI.
    * Complete re-implementation of the boolean function as a C extension for improved performance.
    * Removed precision argument in boolean. It is fixed at 1e-13 for merging close points; otherwise machine precision is used.
    * gds_image now accepts cell names as input.
    * Added optional argument depth to get_polygons.
    * Added option to convert layers and datatypes in imported GDSII cells.
    * Argument exclude_layers from LayoutViewer changed to hidden_layers, and behavior changed accordingly.
    * Shift + right-clicking on a layer in the layer list of LayoutViewer hides/unhides all other layers.
    * New buttons to zoom in and out in LayoutViewer.
    * Referenced cells below a configurable depth are now represented by their bounding boxes in LayoutViewer.

    ### Version 0.2.8 (June 21, 2011)

    * GDSII file import.
    * GDSII output automatically includes required referenced cells.
    * gds_print also accepts a file name as input.
    * Outlines are visible by default in LayoutViewer.
    * Added background color option in LayoutViewer.
    * Right-clicking on the layer list hides/unhides the target layer in LayoutViewer.
    * Cell.cell_list is now a dictionary indexed by name, instead of a list.
    * Added option to exclude created cells from the global list of cells kept in Cell.cell_list.
    * CellReference and CellArray accept names of cells as input.
    * Submodules lost their own __version__.

    ### Version 0.2.7 (April 2, 2011)

    * Bug fixed in boolean, which affected the way polygons with more vertices than the maximum were fractured.
    * gds_image accepts an extra color argument for the image background.
    * Screenshots taken from LayoutViewer have the same background color as the viewer.
    * The functions boolean and slice now also accept CellReference and CellArray as input.
    * Added the method fracture to Polygon and PolygonSet to automatically slice polygons into parts with a predefined maximal number of vertices.
    * Added the method fillet to Polygon and PolygonSet to round corners of polygons.

    ### Version 0.2.6 (February 28, 2011)

    * When saving a GDSII file, ValueError is raised if cell names are duplicated.
    * Save screenshots from LayoutViewer.
    * gds_image accepts cells, instead of lists.
    * Outlines supported by gds_image.
    * LayoutViewer stores bounding box information for all visited layers to save rendering time.

    ### Version 0.2.5 (December 10, 2010)

    * Empty cells no longer break the LayoutViewer.
    * Removed the gds_view function, superseded by the LayoutViewer, along with all dependencies on matplotlib.
    * Fixed a bug in boolean which affected polygons with series of collinear vertices.
    * Added a function to slice polygons along straight lines parallel to an axis.

    ### Version 0.2.4 (September 04, 2010)

    * Added shortcut to Extents in LayoutViewer: Home or a keys.
    * PolygonSet is the new base class for Round, which might bring some incompatibility issues with older scripts.
    * Round elements, PolyPath, L1Path, and Path arc, turn, and parametric sections are now automatically fractured into pieces defined by a maximal number of points.
    * Default value for max_points in boolean changed to 199.
    * Removed the flag to disable the warning about polygons with more than 199 vertices. The warning is shown only for Polygon and PolygonSet.
    * Fixed a bug impeding parallel parametric paths from changing their distance to each other.

    ### Version 0.2.3 (August 09, 2010)

    * Added the PolyPath class to easily create paths with sharp corners.
    * Allow None as an item in the colors parameter of LayoutViewer to make layers invisible.
    * Added color outline mode to LayoutViewer (change outline color with the shift key pressed).
    * Increased the scroll region of the LayoutViewer canvas.
    * Added a fast scroll mode: control + drag 2nd mouse button.
    * Created a new sample script.

    ### Version 0.2.2 (July 29, 2010)

    * Changed the cursor inside LayoutViewer to the standard arrow.
    * Fixed bugs with the Windows version of LayoutViewer (mouse wheel and ruler tool).

    ### Version 0.2.1 (July 29, 2010)

    * Bug fix: gds_image displays an error message instead of crashing when PIL is not found.
    * Added class LayoutViewer, which uses Tkinter (included in all Python distributions) to display the GDSII layout with better controls than the gds_view function. This eliminates the matplotlib requirement for the viewer functionality.
    * New layer colors extending layers 0 to 63.

    ### Version 0.2.0 (July 19, 2010)

    * Fixed a bug in the turn method of Path.
    * Fixed a bug in the boolean function that would give an error when not using Polygon or PolygonSet as input objects.
    * Added the method get_polygons to Cell, CellReference, and CellArray.
    * Added a copy method to Cell.
    * Added a flatten method to Cell to remove references (or array references) to other cells.
    * Fracture boolean output polygons based on the number of vertices to respect the 199-vertex GDSII limit.

    ### Version 0.1.9 (June 04, 2010)

    * Added L1Path class for Manhattan geometry (L1 norm) paths.

    ### Version 0.1.8 (May 10, 2010)

    * Removed the argument fill from gds_view and added a more flexible one: style.
    * Fixed a rounding error in the boolean operator affecting polygons with holes.
    * Added a rotate method to PolygonSet.
    * Added a warning when PolygonSet has more than 199 points.
    * Added a flag to disable the warning about polygons with more than 199 points.
    * Added a turn method to Path, which is easier to use than arc.
    * Added a direction attribute to Path to keep the information used by the segment and turn methods.

    ### Version 0.1.7 (April 12, 2010)

    * New visualization option: save the geometry directly to an image file (lower memory use).
    * New functionality added: boolean operations on polygons (polygon clipping).
    * All classes were adapted to work with the boolean operations.
    * The attribute size in the initializer of class Text no longer has a default value.
    * The name of the argument format in the function gds_view was changed to fill (to avoid confusion with the built-in function format).

    ### Version 0.1.6 (December 15, 2009)

    * Sample script now includes comments and creates an easier-to-understand GDSII example.
    * Improved floating point to integer rounding, which fixes the unit errors at the last digit of the precision in the GDSII file.
    * Fixed the font for character 5.
    * Added a flag to gds_view to avoid the automatic call to matplotlib.pyplot.show().
    * In gds_view, if a layer number is greater than the number of formats defined, the formats are cycled.

    ### Version 0.1.5a (November 15, 2009)

    * Class Text correctly interprets \n and \t characters.
    * Better documentation format, using the Sphinx engine and the NumPy format.

    ### Version 0.1.4 (October 5, 2009)

    * Class Text re-written with a different font with no overlaps and correct size.

    ### Version 0.1.3a (July 29, 2009)

    * Fixed the function to_gds of class Rectangle.

    ### Version 0.1.3 (July 27, 2009)

    * Added the datatype field to all elements of the GDSII structure.

    ### Version 0.1.2 (July 11, 2009)

    * Added the gds_view function to display the GDSII structure using the matplotlib module.
    * Fixed a rotation bug in the CellArray class.
    * Module published under the GNU General Public License (GPL).


    Window System and Platform Integration

    TOM McREYNOLDS , DAVID BLYTHE , in Advanced Graphics Programming Using OpenGL , 2005

    7.3 Anatomy of a Window

    In its simplest form a window is a rectangular region on a display, 1 described by an origin and a size. From the rendering perspective, the window has additional attributes that describe its framebuffer: color index vs. RGB, number of bits per pixel, depth buffer size, accumulation buffer size, number of color buffers (single buffered or double buffered), and so forth. When an OpenGL context is bound to the window, these attributes are used by the OpenGL renderer to determine how to render to it correctly.

    7.3.1 Overlay and Underlay Windows

    Some window systems include the concept of an overlay window. An overlay window always lies on top of non-overlay windows, giving the contents of the overlay window visual priority over the others. In some window systems, notably the X Window System, the overlay window may be opaque or transparent. If the overlay window is opaque, then all pixels in the overlay window have priority over pixels in windows logically underneath the overlay window (below it in the window stacking order). Transparent overlay windows have the property of controlling the visual priority of a pixel using the overlay pixel's color value. Therefore, pixels assigned a special transparent color have lower priority, so the pixel of the window logically underneath this window can be visible.
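The transparent-overlay priority rule described above can be sketched as a per-pixel compositing function. This is a minimal illustration of the concept, not any particular window system's API; the `TRANSPARENT` sentinel stands in for the reserved transparent color value that real systems define.

```python
# Hypothetical sentinel for the window system's reserved transparent color.
TRANSPARENT = None

def visible_pixel(overlay_pixel, main_pixel):
    """The overlay pixel wins unless it carries the transparent color."""
    return main_pixel if overlay_pixel is TRANSPARENT else overlay_pixel

# A scanline where the overlay shows a menu over part of the main window:
overlay = ["menu", "menu", TRANSPARENT, TRANSPARENT]
main = ["scene", "scene", "scene", "scene"]
row = [visible_pixel(o, m) for o, m in zip(overlay, main)]
# row == ["menu", "menu", "scene", "scene"]
```

An opaque overlay is the degenerate case where no pixel ever carries the transparent color, so the overlay always wins.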

    Overlay windows are useful for implementing popup menus and other graphical user interface components. They can also be useful for overlaying other types of annotations onto a rendered scene. The principal advantage of using an overlay window rather than drawing directly into the main window is that the two windows can be updated independently—to change window annotations requires redrawing only the overlay window. This assumes that overlay window independence is really implemented by the window system and display hardware, and not simulated. 2 Overlays become particularly useful if the contents of the main window are expensive to regenerate. Overlay windows are often used to display data decoded from a multimedia video source on top of other windows with the hardware accelerator decoding the stream directly to a separate overlay framebuffer.

    Similar to the concept of overlays, there is the analogous concept of an underlay window, which has the lowest visual priority. Such a window is only useful when the windows logically above it contain transparent pixels. In general, the need for underlay windows has been minimal; few OpenGL implementations support them.

    7.3.2 Multiple Displays

    Some operating system/window system combinations can support multiple displays. Some configure multiple displays to share the same accelerator; in the more general case, multiple accelerators each drive multiple displays. In both cases the details of attaching and using a context with windows on different displays become more complicated, and depend on window system embedding details.

    Figure 7.2 shows an example of a three-display system in which two of the displays are driven from one graphics accelerator, while a third display is driven from a second accelerator. To use all of the available displays typically involves the use of multiple OpenGL contexts. Since an OpenGL context encapsulates the renderer state, and this state may be contained inside hardware, it follows that each hardware accelerator needs an independent context. If the two accelerators were built by different vendors, they would likely use two different OpenGL implementations. A well-designed operating system/window system embedding layer can allow both accelerators to be used from a single application by creating a context corresponding to each accelerator/display combination. For example, in a GLX-based system, the accelerator is identified by its DISPLAY name; an application can create GLX contexts corresponding to the individual DISPLAY names.

    Figure 7.2 . Three-display system using two accelerators.

    Multiple display systems go by many different names (multimon, dual-head, Twin-View are examples) but all are trying to achieve the same end. They all drive multiple displays, monitors, or video channels from the same accelerator card. The simplest configuration provides a large logical framebuffer from which individual rectangular video or display channels are carved out (as shown in Figure 7.3 ). This is similar to the way windows are allocated from a framebuffer on a single display. The amount of framebuffer memory used by each channel depends on the resolution of the channel and the pixel formats supported in each channel.
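The framebuffer carve-out can be estimated directly from each channel's resolution and pixel depth. A back-of-the-envelope sketch (the example resolutions and depths are illustrative, not taken from the text):

```python
def channel_bytes(width, height, bits_per_pixel):
    """Framebuffer memory consumed by one video channel, in bytes."""
    return width * height * bits_per_pixel // 8

# Two channels carved from one framebuffer, as in Figure 7.3:
# a 1920x1080 32-bit RGBA channel and a 1280x1024 16-bit channel.
total = channel_bytes(1920, 1080, 32) + channel_bytes(1280, 1024, 16)
print(total)  # 8294400 + 2621440 = 10915840 bytes
```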

    Figure 7.3 . Two video channels allocated from one framebuffer.


    Different annotations tools

    SuperAnnotate

    SuperAnnotate is an AI-powered image and video annotation platform. It has a partnership with OpenCV for its desktop version.

    • Allows users to create high-quality training datasets by providing annotations for computer vision tasks.
    • Design project workflows and distribute tasks among teams.
    • Building large projects at scale.
    • Using active learning to accurately annotate images.
    • Annotation automation for predefined classes.
    • Transfer learning to predict new classes.
    • Use of QA automation to detect mislabeled annotations.
    • Viewing analytics to keep track of annotation speed and quality.

    LabelBox

    Labelbox is an enterprise-grade platform providing training-data solutions, with AI-enabled labeling tools for both image and text data, labeling automation, human-workforce integration, and data management. It provides access to a powerful API, along with a Python SDK for extensibility.

    • Best suited for commercial solutions with the features for creating and maintaining high-quality training data.
    • Labeling tools for images, video, text, and geospatial data.
    • A standardized way for organizations to collaborate on the creation, management, and review of data.
    • Automated labeling to reduce costs and enhance speed, with QA.
    • An external labeling service to support and maintain data quality alongside an internal labeling team.

    Playment

    Playment helps ML teams build high-quality training data with ML-assisted tools, structured project management systems, expert human workforce, and much more. Provides solutions in image, video, and sensor annotation along with API integration to ML pipelines, and GT Studio.

    • Has the best-in-class annotations for Lidar and Radar.
    • A standardized way to manage high-quality training data for computer vision tasks.
    • Has a Ground-truth Studio to serve data labeling for creating diverse, high-quality ground truth datasets at scale.
    • Streamline data pipelines to enable faster development of AI systems.
    • Auto-scaling Workforce.
    • Provisions for customized use cases.

    Clarifai

    Clarifai is one of the leading data annotation platforms providing developers, data scientists, and enterprises with deep learning tools to build entire AI lifecycles for various products and use-cases.

    • Workflow management
    • API integration
    • Wide range of computer vision and NLP tasks across various industries
    • Provisions for custom and pre-trained AI models
    • Nominal pricing as per usage
    • Scalable deployment
    • User-friendly UI/UX
    • Quality assurance by professionals

    Datasaur

    Datasaur is one of the best text annotation platforms providing AI-based solutions to extract, analyze, maintain, and modify text data.

    • Datasaur uses NLP along with other ML-assisted tools to build high-quality training text data.
    • Can detect misclassified content using automation tools
    • Provide summarization and analysis
    • Free usage up to 5000 labels per month with 100MB storage
    • Optimized labeling interface, Fully programmatic project creation and export via API, Regular Expression extension, Automatic file converter, Data validation, and review.
    • Team Management, Performance Dashboard, Data Privacy, Cloud sync

    Lightly

    Lightly uses self-supervised learning, one of the prominent deep learning techniques, to enhance data labeling. It can improve ML models with its tools for data preparation and curation for vision data.

    • Can perform image classification and image segmentation
    • On-premise Docker service to store, manage and work efficiently
    • Has both web app and Python API interfaces
    • Built on top of the PyTorch library.
    • Performance measures of datasets through graph analysis
    • Active feedback and support
    • Free services up to 5000 private and 25000 public images

    Hive

    Hive provides enterprise AI solutions for industry-specific use cases, used in both computer vision and NLP tasks. Hive positions itself as an AI-as-a-service platform.

    • Data labelling by categorizing
    • Entire workflow management with constant feedback and support until the final production
    • Hive Predict is a model-as-a-service providing predictions on visual, audio, and text data
    • Training data is customizable, flexible, and built with proper high-quality assurance.

    To know more visit -> Hive

    Lionbridge

    Lionbridge deals with all kinds of data (image, video, audio, text, and geospatial) for annotation and labeling services. It is one of the oldest companies in the market.

    • Its text annotation has multilingual services covering many languages across the globe.
    • Provides entire service from data collection to validation.
    • Has open access to 300+ datasets
    • Follows human-in-loop annotation format by crowdsourcing
    • AI consulting
    • Partnered with and trusted by Fortune 500 companies

    V7 Darwin

    V7 Labs launched the V7 Darwin platform for data annotation and data labeling. Darwin uses deep learning algorithms to generate state-of-the-art, high-quality ground truth datasets.

    • End to end services for computer vision tasks.
    • Automated image annotation
    • Use of active learning for training datasets
    • Allows team collaboration and data visualization
    • API and CLI tools availability along with Python SDK
    • Complete model training pipeline
    • Quality Review during the entire product lifecycle

    Amazon Sagemaker Ground Truth

    AWS, as we all know, is a leading cloud service provider. Amazon Sagemaker Ground Truth is one of its products, used for data labeling to generate ground truth datasets with the Amazon Sagemaker machine learning platform.

    • Sagemaker GT can be integrated with Amazon Mechanical Turk
    • Labelling goes through various processes, with assisted labelling by external and internal labellers
    • Label verification, adjustment, and validation
    • Flexible pricing
    • Datasets are stored in S3 (Amazon Simple Storage Service) buckets
    • Amazon CLI to download the annotated dataset

    LightTag

    LightTag is another text annotation platform providing faster NLP services.

    • Allows role assignments for distributing various annotation tasks
    • Multilingual
    • Performance dashboard for both data and annotators
    • Evaluation metrics
    • Automation
    • Review & QA.

    Kili Technology

    Kili Technology covers all multimedia data types for annotation and labelling at industry-specific levels.

    • Computer vision (image, video) and NLP (text, PDF, voice) topics
    • Onboarding of business experts & external workforce to scale projects
    • Simple collaboration, quality control, data management, and labeling workforce
    • Available online or on-premise
    • ML with active learning, online learning, and semi-supervised learning
    • Python client and GraphQL API

    To know more visit -> Kili Technology

    Dataturks

    Dataturks is an AI startup that was later acquired by Walmart Labs. It helps developers and researchers annotate image, video, and text data.

    • Open source datasets are available
    • Generates real-time reports
    • Enables crowdsourcing
    • Has open-sourced GitHub repo
    • Software support in Linux and Windows
    • Complete API service to upload, process, and download data

    TagTog

    TagTog is another self-supervised text annotation tool.

    • NLP modeling
    • Text analytics, visualization, and annotation
    • SMEs with domain-specific insights
    • Provides moderation and customization
    • Access to pre-annotated data
    • Multilingual
    • Unicode support
    • Multiple format support (PDF, CSV, etc.)
    • Python and JavaScript API

    LinkedAI

    LinkedAI is a no-code, AI-assisted annotation platform focused mostly on computer vision, but it also offers NLP services.

    • Data labelling and data tagging
    • Synthetic data generation
    • Quality checks by professionals
    • Auto labelling services
    • Crowdsourcing
    • Annotations available in JSON and CSV

    Choose the Suitable Data Annotation Tool

    Tool Name | Services Provided | Tools | Solutions / Use Cases
    --- | --- | --- | ---
    SuperAnnotate | Image & Video | Bounding boxes, Polylines, polygons, Cuboid, Ellipse, Line, Point | Aerial Imaging, Autonomous Driving, Retail, Security & Surveillance, Medical, Robotics
    LabelBox | Image, Video, Text, Geospatial data | Bounding box, points, superpixel, brush, eraser, polylines, polygons, NER | Document data extraction, manufacturing, health, insurance, aerial, agriculture, transportation
    Playment | Image, Video, Sensor | 2D & 3D bounding box, polygons, cuboid, polylines, landmark, semantic & point cloud segmentation, 2D-3D object linking | Autonomous Vehicles, Human Pose Estimation and Tracking, Security surveillance, insurance, fashion, gaming, agriculture
    Clarifai | Image, Video, Text | Single and multilabel classification, bounding box, polyline, video tracking, NER, OCR, text moderation | E-commerce, hospitality, document analysis, user content monitoring, chatbots, aviation, tourism, OTT platforms, insurance, public sector, brick & mortar
    Datasaur | Text | Named Entity Recognition, Part-of-speech, Coreference Resolution, Dependency Resolution, Document Labelling, OCR | Finance, Healthcare, Legal, Media, E-commerce
    Lightly | Image and Video | Data augmentation, semantic segmentation | Autonomous Vehicles, Visual Inspection, Medical Imagery, Geospatial Data
    Hive | Image, audio, video, text | Bounding boxes, polygons, semantic segmentation, cuboids, key points, lines, principal axes rotation, timestamp, contours, transcriptions | Logo identification, content moderation, document parsing, retail, advertisement, automotive, hospitality, speech to text
    Lionbridge | Image, Video, Audio, Text, Geospatial | 2D & 3D bounding boxes, cuboids, Image Classification/Categorization, Landmark Annotation, Pixel-precise/Pixel-wise Segmentation, Polygons, Semantic Segmentation, Grammar and Spelling, Machine Translation Quality Assurance, Indent Variation | AR/VR, Drones and aerial imagery, Autonomous Vehicles, Car infotainment, Face Recognition, Medical Imagery, Video Data analysis, Social Media, Robotics, Analytics and visualization, Sentiment analysis, entity extraction, Automatic Speech Recognition, Voice assistants, Text-to-Speech, pronunciation dictionary creation, Sales Call Analysis, Point of interest tagging, address verification, car and pedestrian routing
    V7 Darwin | Image & Video | Polygon, brush and eraser, bounding boxes, key points, line, ellipse, cuboid, classification tags, attributes, instance tags, directional vectors | Vision AI for the visually impaired, Retail, life sciences, environment, manufacturing
    Amazon Sagemaker GT | Image, Video and Text | Image Classification, Object Detection, and Semantic Segmentation, multi-frame object classification, object tracking, and video clip classification, 3D point clouds, Entity extraction | Autonomous vehicles, product descriptions, movie reviews or sentiment analysis
    LightTag | Text | Span Annotation, Entity Annotations, Relationship Annotations, Phrase and Subword Annotations, Document Metadata, Pre-Annotations, Keyboard Shortcuts, Document Classifications, Document Tagging, Very Long Class Lists, Guidelines, Auto Save, Search | Finance, legal, medical
    Kili Technology | Image, video, audio and text | Points, polyline, polygon, bounding boxes, and segmentation | Object detection, OCR, entity extraction, Image classification, Medical Imagery, Audio transcription, Conversational Bot
    Dataturks | Image, video and text | Image classification and segmentation, object detection using polygons and bounding boxes, OCR, Document Annotation, Sublabels, NER, PoS | Text Summarization, Content Moderation, Image Label generation
    TagTog | Text | Entity extraction, entity normalisation, concept search, Big Texts, annotated corpus | Semantic search, text mining, Chatbot Training, business intelligence, and CRM data enrichment
    LinkedAI | Image, Video & Text | Bounding boxes, polygons, lines, semantic segmentation and landmarks | Image categorization, autonomous vehicles, face recognition systems

    Geospatial Data Extraction


    Welcome to the geospatial data extraction tool!

    This slide deck will guide you through all options available and hopefully help you find what you are looking for.

    The purpose of this application is to provide tailored geospatial datasets based on your needs. Here are the basic steps to extract data:

    1. Select which data product to clip
    2. Find and select the clipping area
    3. Fill the extraction form and submit it
    4. Receive email and download your package

    Select data to be extracted

    This section lists all data products available for dynamic data extraction. If you hover your mouse over a link, you will see a description of the data product.

    When you click on a link, the interface will switch to the Select clipping area section, while a data extraction form is built and presented to you in the Select options and submit job section.

    ** Note that some data products (like Automatic Extraction Data) do not cover the entire Canadian territory. When you select such a product, a layer representing its availability limits will automatically be displayed on the map. Make sure that your clipping area overlaps that data limit layer. You will be able to adjust the opacity of this layer in the Overlay reference layers section.

    Find your clipping area

    There are many ways to find your area of interest. One of them is the Find a location section.

    You can search for the following features:

    • Street addresses
    • Street names
    • Streets intersection
    • Place names such as towns, villages, municipalities, parks
    • Natural geographical features such as lakes, islands, rivers, mountains
    • Postal codes (FSA code only - first three characters)
    • Map numbers from the National Topographic System (NTS)
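Of the searchable feature types above, the postal-code restriction (FSA only) is easy to pre-check, since a Canadian FSA is the letter-digit-letter prefix of a postal code. A minimal sketch; the function name is ours, not part of the tool:

```python
import re

# A forward sortation area (FSA) is the first three characters of a Canadian
# postal code: letter, digit, letter (e.g. "X1A" for Yellowknife).
FSA_PATTERN = re.compile(r"[A-Za-z]\d[A-Za-z]")

def looks_like_fsa(query: str) -> bool:
    """Return True if the search string is a plausible FSA code."""
    return FSA_PATTERN.fullmatch(query.strip()) is not None

print(looks_like_fsa("X1A"))      # True
print(looks_like_fsa("X1A 2N3"))  # False: only the three-character FSA is accepted
```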

    In the following example, a search for yellowknife was made. The results are presented in a table underneath the search box. Each result has a "Zoom to" button. When pressed, the map extent will be updated to show the selected area.

    Another way to find your area of interest is to simply zoom in, zoom out and pan the map.

    Overlay reference layer(s)

    The overlay reference layers are there to provide contextual information to help you localize your clipping area and / or better delineate it.

    All overlays are transparent at first. Moving the slider from left to right will increase the opacity of the overlay until it reaches 100%.

    Select your clipping area

    Once you have found your area of interest you need to select the clipping area. The clipping area is defined by a series of geographic coordinates that make up a bounding box or a polygon.

    Four options are available to help you select the clipping area that best suits your need.

    • Current Map Extent
    • Predefined Clipping Area
    • Custom Clipping Area
    • Area from a Shapefile

    The fifth option lets you remove your current selection.

    Select your clipping area - Current Map Extent

    If you select the current map extent option an orange bounding box will cover the entire extent of the map.

    For example, if you use the Find a location tool, search for Fogo, zoom to the town, and select the current map extent, your map will look like this.

    Select your clipping area - Predefined Clipping Area

    Predefined clipping areas are tiled views of the map. In other words, they each represent a different way to partition the territory.

    You can choose from 3 types of partition or tiling system:

    1. Drainage areas
    2. Landsat image footprints
    3. Map sheets from the National Topographic System (NTS)

    A list of tiles will be displayed under the dropdown menu. The list contains a maximum of 10 tiles that intersect the map extent. If more than 10 tiles intersect the map extent, you will be able to page through all the results.

    If you hover over the tile number with your mouse, you will see a blue preview image of the tile. You can also select a tile by clicking on it, and it will become your clipping area (orange).

    Select your clipping area - Custom Clipping Area

    Sometimes the other options are too much or not enough. With the custom clipping area tools you have complete control over the selection of the clipping area.

    You can choose from 3 custom options:

    1. Draw a rectangle
    2. Draw a polygon
    3. Enter bounding box coordinates

    The first two options are similar. After selecting one of them, click on the map to start your drawing. When drawing a rectangle, drag the mouse over the map and release the button to finish. As for the polygon, click on the map to add the points that make up the polygon, and double-click to finish drawing.

    Once you have completed your drawing, four more options are available to modify your clipping area: reshape, rotate, resize, and drag. They all work alike: after selecting one of them, the shape will change color and one or more orange circles will appear. Depending on the action selected, click on an orange circle to reshape, rotate, resize, or move the shape.

    The third options will display a bounding box on the map after you enter the lower left and upper right coordinates in the form.

    Select your clipping area - Area from a Shapefile

    A Shapefile is a vector data exchange format. It is composed of multiple files that describe the geometry format, the projection, the attributes, and so on.

    This option only accepts Shapefiles that contain polygon geometries. Your Shapefile must be compressed into a ZIP file before you upload it.

    First you will be asked to choose the file you wish to use as a clipping area. Once your choice is made, the file will be uploaded and validated. If validation confirms that:

    1. it contains one or more polygons
    2. it has one of the following supported coordinate systems:
      • NAD83 UTM
      • NAD83 LAT/LON
      • NAD83 LAMBERT
    3. the total calculated area is less than the maximum allowed (may vary by product)

    then the Display on map button will become active. By clicking on it, your shape will show up on the map.

    Select options and submit job

    This section is where you make choices to get the end product that best suits your needs. Make sure all fields marked as required have been filled in, then submit your job.

    After submitting your job, a message will appear at the top of the application indicating whether your request was submitted successfully: a green banner on success, or a red banner if an error occurred while transmitting the request to the processing server.

    Since the data extraction process is asynchronous, you will receive an email once your job is completed. The email will contain a download link to your data package and other related information on the product you extracted.

    Job status

    For each job successfully submitted, a new row will be added to the table. Each row contains the ID of the job, its status, and a refresh button. Every time the button is clicked, a request is made to the processing server to check if the status has changed. Here is a table of the different status types with a short description of each:

    Status Description
    Submitted The job has been received by the server but has not been added to the job queue.
    Queued The job has been added to the job queue and is waiting to be processed.
    Success The job has completed and the transformation was successful.
    Processing error The job has completed, but a failure was reported while attempting to run the transformation.
    Server failure The server could not process the job.
    Processing The job has been pulled from the job queue and is being processed.

    This section offers contextual information to help you locate and select a clipping area. For the moment, there is only one basemap layer offered.

    All overlays are transparent at first. Moving the slider from left to right increases the opacity of the overlay until it reaches 100%.

    Basemap and reference overlay are not included in the extraction package.

    This section lists all products from which it is possible to extract data.

    Choose one product and proceed to the following section "Select options and submit job". At any moment you can come back and choose another product.

    The basemap and reference overlay data are not included in the extraction package.

    A clipping area delineates a geographic region of interest. The application uses the clipping area to apply a spatial filter to return data entirely within this area.

    Several options are available to select a clipping area:

    This option selects a bounding box that covers the extent of the map. To change the selected extent, zoom in / zoom out / pan and click the radio button again.

    This option offers 3 types of tiling coverage via a delimitation service:

    • Drainage areas
    • Landsat image footprints
    • Map sheets from the National Topographic System (NTS)

    This option lets you draw and edit directly on the map a bounding box or a polygon. You can also enter a pair of geographic coordinates.

    With the rectangle drawing control active, click on the map and drag the mouse to get a rectangle. Release the mouse button to finish. Once the geometry is drawn you can resize, reshape, rotate and drag it.

    After activating this option, click on one of the feature's vertices (orange circles) and move it to the desired position. Release the mouse button to finish.

    After activating this option, click on the feature's rotation handle (orange circle on the bottom right) and rotate the feature to the desired angle. Release the mouse button to finish.

    After activating this option, click on the feature's resize handle (orange circle on the bottom right) and resize the feature to the desired size. Release the mouse button to finish.

    After activating this option, click on the feature's drag handle (orange circle in the middle) and drag the feature to the desired position. Release the mouse button to finish.

    With the polygon drawing control active, click on the map to add points that make up the polygon. Double-click to finish drawing. Once the geometry is drawn you can resize, reshape, rotate and drag it.

    After activating this option, click on one of the feature's vertices (orange circles) and move it to the desired position. Release the mouse button to finish.

    After activating this option, click on the feature's rotation handle (orange circle on the bottom right) and rotate the feature to the desired angle. Release the mouse button to finish.

    After activating this option, click on the feature's resize handle (orange circle on the bottom right) and resize the feature to the desired size. Release the mouse button to finish.

    After activating this option, click on the feature's drag handle (orange circle in the middle) and drag the feature to the desired position. Release the mouse button to finish.

    Enter the lower left and upper right coordinates in the appropriate fields. Once all the fields have been filled out, click on the "Display area on map" button to draw the bounding box on the map.

    It is now possible to add one or more polygons from a Shapefile as a clipping area.

    This new feature accepts a Shapefile compressed in a ZIP file.

    Supported coordinate systems:

    If a clipping area has been previously selected, this option will remove the selection from the map and the extraction form.


    Search Developer's Guide — Chapter 14

    This chapter describes how to use the geospatial features of MarkLogic and describes the type of applications that might use these functions. MarkLogic supports geospatial data represented in either XML or JSON, and supports geospatial search in several languages, including XQuery, Server-Side JavaScript, Java, and Node.js.

    This chapter includes the following sections:

    Terms and Definitions

    You should be familiar with the following terms and definitions before using the geospatial features of MarkLogic Server:

    Term Definition
    coordinate system A geospatial coordinate system is a set of mappings that map places on Earth to a set of numbers. The north/south (vertical) axis is represented by a latitude coordinate, and the east/west (horizontal) axis is represented by a longitude coordinate. Together they make up a coordinate system that is used to map places on the Earth. For more details, see Understanding Geodetic Coordinates.
    distance The distance between two geospatial objects refers to the geographical closeness of those geospatial objects.
    ETRS89 ETRS89, or European Terrestrial Reference System 1989, is an earth-centered geodetic coordinate system. This is one of the coordinate systems you can use for computations, search and indexing of geospatial data. For details, see Multiple Coordinate Systems.
    point A geospatial point is a discrete location, identified by two coordinates. In a geodetic coordinate system, a point is identified by its latitude and longitude coordinates. For more details, see Understanding Points.
    point query A point query matches points in documents against point or other region search criteria. When the criteria are expressed as points, a document matches if a point in the document is equal to the input criteria. When the criteria are expressed as other region types, a document matches if a point in the document is within the input region. Use a region query to match non-point regions in documents.
    proximity The proximity of search results is how close matches are to each other in a document. Proximity can apply to any type of search terms, including geospatial search terms. For example, you might want to find the term dog within 10 words of a point in a given zip code.
    raw A Euclidean coordinate system. This is one of the coordinate systems you can use for computations, search and indexing of geospatial data. For details, see Multiple Coordinate Systems.
    region A region is a set of points that describe a point, box, circle, polygon, or linestring. For details, see Understanding Coordinate Systems.
    region query A region query matches regions in documents against region search criteria. A document matches if a region in the document satisfies a specified relationship with the input regions, such as overlaps, intersects, contains, or within. When searching for matching points, you should usually use a point query instead of a region query. For details, see Searching for Matching Regions.
    tolerance A distance within which two points are considered equal, a point is considered on an edge, or two edges are considered touching, even when the coordinate values do not match exactly. For details, see Understanding Tolerance.
    WGS84 WGS84, or World Geodetic System version 1984, is an earth-centered geodetic coordinate system. This is one of the coordinate systems you can use for computations, search and indexing of geospatial data. For details, see Multiple Coordinate Systems.
    WKT WKT, or Well Known Text, is a common string representation of geospatial data. You can convert to and from WKT and the internal MarkLogic representation of a region or point. For details, see Converting To and From Common Geospatial Representations.
    WKB WKB, or Well Known Binary, is a common binary representation of geospatial data. You can convert to and from WKB and the internal MarkLogic representation of a region or point. For details, see Converting To and From Common Geospatial Representations.
    governing coordinate system The coordinate system/precision combination in effect during a geospatial operation. For details, see The Governing Coordinate System.
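    As a minimal illustration of the tolerance concept from the table above, two points can be treated as equal when the distance between them is within the tolerance. The sketch below uses a plain Euclidean distance, as in the raw coordinate system; it is not the MarkLogic implementation, and the function names are hypothetical.

```javascript
// Hypothetical sketch of tolerance-based point equality, using a plain
// Euclidean distance as in the raw coordinate system. In a geodetic
// coordinate system the distance would be measured along a geodesic.
function euclideanDistance(p1, p2) {
  return Math.hypot(p1.x - p2.x, p1.y - p2.y);
}

function pointsEqualWithinTolerance(p1, p2, tolerance) {
  return euclideanDistance(p1, p2) <= tolerance;
}

const a = { x: 10.0, y: 20.0 };
const b = { x: 10.0, y: 20.4 };
console.log(pointsEqualWithinTolerance(a, b, 0.5)); // true
console.log(pointsEqualWithinTolerance(a, b, 0.1)); // false
```

    The same idea extends to the "point on an edge" and "edges touching" cases: each is a distance comparison against the tolerance.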

    Licensing Requirements for Geospatial Features

    You must have an Advanced Geospatial License Option to use the following geospatial features:

    • The functions geo:complex-polygon-contains, geo:complex-polygon-intersects, geo.complexPolygonContains, and geo.complexPolygonIntersects.
    • Double precision coordinates, including wgs84/double, etrs89/double, and raw/double.
    • cts:reverse-query or cts.reverseQuery with geospatial constraints. This is sometimes called geo alerting.

    No other geospatial features or capabilities in MarkLogic require the Advanced Geospatial License Option.

    Geospatial Features Overview

    This section provides a brief overview of key features of the geospatial capabilities of MarkLogic Server. Each topic includes pointers to deeper discussion of the feature. The following topics are covered:

    Search for Points, Polygons, and Other Regions

    In MarkLogic, you can construct searches based on either points (discrete locations) or regions (areas). A geospatial query can match points, polygons, and other regions in your documents against points, boxes, circles, polygons, complex polygons, and linestrings search criteria.

    You can compare points for equality to other points or for containment in regions. You can compare polygons and other regions using a rich set of topological operators that includes containment, overlap, and intersection.

    For example, you can use geospatial search in MarkLogic to find documents matching criteria such as the following:

    • Match points against other points. For example, find documents containing this point.
    • Match points within regions. For example, find documents containing a point within this circle.
    • Match regions against each other. For example, find documents containing a polygon that overlaps this polygon, or find documents containing a region that intersects this linestring.

    Notice that the first two query types match points in documents. These are called point queries. You can only use a point query to test for equality to another point or containment within a region.

    To search for regions satisfying relationships such as intersection, containment, and overlap, use a region query.

    Geospatial Type System

    The geospatial interfaces in MarkLogic operate on a geospatial type hierarchy based on point and region primitive types. The type system includes region subtypes for specific region types, such as circle, box, and linestring.

    The cts:point XQuery type and cts.point JavaScript type represent a point. Points are used as building blocks for the region types. The cts:region XQuery type and the cts.region JavaScript type serve as the base type for all regions.

    The geospatial interfaces include constructors for creating points and all supported region types. For example, you can create a polygon value using the cts:polygon XQuery constructor or the cts.polygon JavaScript constructor.

    Multiple Coordinate Systems

    Geospatial data can be expressed in one of several coordinate systems, including WGS84, ETRS89, and raw. WGS84 and ETRS89 are earth-centered geodetic coordinate systems. Raw is a flat plane, cartesian coordinate system. For more details, see Supported Coordinate Systems.

    MarkLogic also supports both single and double precision coordinates for each coordinate system. The precision is coupled with the coordinate system in most contexts. For example, when constructing a geospatial point index, your choice of coordinate system includes both wgs84 (single precision) and wgs84/double (double precision).

    For details, see the following topics:

    Support for Common Geospatial Representations

    Many MarkLogic interfaces work with geospatial data in common formats, such as Well Known Text (WKT), Well Known Binary (WKB), KML, GML, and GeoJSON.

    Flexible Data Layout

    Geospatial data in MarkLogic is stored in XML elements and/or attributes, and JSON properties. The coordinates of a point or region thus stored can be represented in several different ways. You can also identify the location of your geospatial data in several different ways.

    For point queries, you can specify the location of coordinates in your documents by XPath expression, XML element name, XML element attribute name, or JSON property name. In addition, the coordinates of a point can be either a single, compound value ("10.5 32.7") or separate latitude and longitude values.
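    As a small illustration of the two point layouts, a compound value such as "10.5 32.7" can be split into separate latitude and longitude numbers. The helper below is hypothetical, not a MarkLogic API:

```javascript
// Hypothetical helper: split a compound point value ("10.5 32.7") into
// separate latitude and longitude numbers.
function parseCompoundPoint(value) {
  const [lat, lon] = value.trim().split(/\s+/).map(Number);
  return { lat, lon };
}

console.log(parseCompoundPoint("10.5 32.7")); // { lat: 10.5, lon: 32.7 }
```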

    For region queries, specify the location of the region coordinates using an XPath expression. Region coordinates must be stored as WKT or serialized cts:region values.

    Coordinates can also be stored, indexed, and interpreted as either single or double precision values. The original precision is always preserved in your documents, but the configured precision determines the precision at which coordinates are indexed and interpreted during computations.

    Support for Single and Double Precision Coordinates

    You can evaluate geospatial queries and create geospatial indexes that interpret coordinates as either single (float) or double precision values. You should usually choose single precision, unless your application requires fine-grained accuracy (less than 1 meter).

    The default precision depends on your evaluation context. If you do nothing to explicitly configure the precision of your App Server or evaluation context, then single precision is used.
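    The accuracy trade-off can be illustrated with plain JavaScript, whose numbers are double precision, and Math.fround, which rounds to the nearest single precision value. This only illustrates float versus double resolution in general, not MarkLogic's index internals:

```javascript
// JavaScript numbers are doubles; Math.fround rounds to the nearest
// single precision (float) value, illustrating the resolution loss.
const lon = -122.4194155;        // a double precision longitude
const single = Math.fround(lon); // nearest single precision value

// Near |longitude| = 128, single precision resolves only about
// 0.0000076 degrees (sub-meter on the ground), so rounding shifts
// the coordinate slightly.
console.log(single !== lon);                // true
console.log(Math.abs(single - lon) < 1e-5); // true
```

    This is why single precision is usually sufficient unless sub-meter accuracy matters.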

    For details, see the following topics:

    Geospatial Computational Utility Functions

    MarkLogic provides a rich set of geospatial utility functions, including the following:

    • Computing distance and bearing
    • Counting region vertices
    • Finding the point at which two arcs intersect
    • Generating a set of bounding boxes that cover a region

    Geospatial Format Conversion Functions

    MarkLogic provides XQuery and JavaScript library modules to translate Metacarta, GML, KML, GeoRSS, and GeoJSON formats to MarkLogic primitive geospatial types.

    The functions in these libraries take geospatial data in supported formats and convert it into primitive MarkLogic geospatial types for use with geospatial query constructors and other geospatial operations.

    For more details, see the following topics:

    Support in Multiple APIs

    This chapter focuses on performing geospatial queries in MarkLogic Server using the cts:search XQuery function or cts.search Server-Side JavaScript function. You can also configure and use geospatial search with the following MarkLogic APIs:

    • Search API (XQuery or Server-Side JavaScript) see Appendix: Query Options Reference and Searching Using Structured Queries.
    • JSearch API (Server-Side JavaScript) see Creating JavaScript Search Applications and the examples in this chapter.
    • Client APIs for Node.js, Java, and REST see Creating Point Queries with the Client APIs and Creating Region Queries Using the Client APIs.
    • REST Management API (for creating and managing geospatial indexes) see the Monitoring MarkLogic Guide and MarkLogic REST API Reference.

    Understanding Coordinate Systems

    In its most basic form, geospatial data is a set of coordinates. The interpretation of the coordinates is based on a coordinate system. For example, a geodetic coordinate system interprets the coordinates as latitude and longitude values applying to the surface of the earth.

    MarkLogic supports both geodetic and Euclidean coordinate systems.

    This section covers the following topics:

    Understanding Points

    A point represents a discrete location. In a geodetic coordinate system such as WGS84, a point represents a discrete location on the earth. In a Euclidean coordinate system such as raw, a point represents a discrete location in the Euclidean space.

    A point is represented by an ordered pair of numbers called coordinates. In a geodetic coordinate system, these numbers represent latitude and longitude values on the earth; for more details, see Understanding Geodetic Coordinates. In a 2-dimensional Euclidean coordinate system, these numbers represent horizontal (x) and vertical (y) values; for more details, see Understanding Euclidean Coordinates.

    The cts:point XQuery type and cts.point JavaScript type represent a point in MarkLogic Server. Use the cts:point or cts.point constructor to construct a point from a pair of coordinates.

    Points are also used to define the other regions in MarkLogic Server, and constructor functions are available for these regions, such as cts:box in XQuery or cts.polygon in JavaScript. To learn about supported region types, see Understanding MarkLogic Geospatial Region Types.

    Understanding Geodetic Coordinates

    A geodetic coordinate system maps points to locations on the Earth. MarkLogic supports geodetic coordinate systems such as WGS84 and ETRS89.

    The coordinates of a point in a geodetic coordinate system represent latitude and longitude positions on the Earth. A point has one latitude coordinate and one longitude coordinate. The latitude coordinate represents the north/south position of the point on the Earth. The longitude coordinate represents the east/west position of the point on the Earth.

    Point coordinates are expressed in decimal degrees. Distance is measured in units such as miles, feet, kilometers, and meters.

    In a geodetic coordinate system, the shortest distance between two points is a curve called a geodesic arc or simply a geodesic. (In a spherical coordinate system, a geodesic is the same as a great circle.) The edges of a polygon in a geodetic coordinate system are geodesics, not straight lines.
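    To make the geodesic idea concrete, the great circle distance on a sphere can be computed with the haversine formula. The sketch below treats the Earth as a sphere with a mean radius of 6371 km; geodetic systems such as WGS84 model an ellipsoid, so real geodesic distances differ slightly:

```javascript
// Great circle (spherical geodesic) distance via the haversine formula.
// This treats the Earth as a sphere with mean radius 6371 km; geodetic
// systems such as WGS84 model an ellipsoid, so results differ slightly.
const EARTH_RADIUS_KM = 6371;

function haversineKm(lat1, lon1, lat2, lon2) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// A quarter of the equator, about 10008 km with this radius.
console.log(Math.round(haversineKm(0, 0, 0, 90))); // 10008
```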

    Latitude values are in the range -90 to 90 degrees. The equator has latitude zero. Negative latitude values are south of the equator, with -90 at the south pole. Positive latitude values are north of the equator, with 90 at the north pole.

    Longitude values are in the range -180 to 180 degrees. The Prime Meridian has longitude 0. Negative longitude values span the 180 degrees west of the Prime Meridian. Positive longitude values span the 180 degrees east of the Prime Meridian.
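    The latitude and longitude ranges above translate into a minimal validity check. This is a hypothetical helper for illustration, not a MarkLogic function:

```javascript
// Hypothetical validity check based on the latitude and longitude
// ranges described above (not a MarkLogic API).
function isValidGeodeticPoint(lat, lon) {
  return lat >= -90 && lat <= 90 && lon >= -180 && lon <= 180;
}

console.log(isValidGeodeticPoint(45.5, -73.6)); // true
console.log(isValidGeodeticPoint(91, 0));       // false (latitude out of range)
```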

    Understanding Euclidean Coordinates

    A Euclidean coordinate system maps points to locations on a two-dimensional Euclidean plane. The raw coordinate system in MarkLogic is a Euclidean coordinate system; for more details, see Raw Coordinate System.

    A Euclidean coordinate system can be used to represent non-Earth spatial data in local coordinate systems, such as for mathematical modeling or when projecting geographic points onto a flat plane.

    A point in the raw coordinate system is represented by an (x,y) value pair, where x represents the horizontal position on the plane and y represents the vertical position. The interpretation of the coordinates is application specific, as is the range of values.

    Point coordinates and distances in a Euclidean coordinate system are interpreted in an application-specific way. The units for x, y, and distance are assumed to be the same. The edges of a polygon are straight lines in a Euclidean coordinate system.

    Most of the geospatial interfaces and documentation in MarkLogic refer to the coordinates of a point or region using latitude and longitude terminology. However, when working with a raw coordinate system, the coordinates do not actually represent latitude and longitude values. Instead, latitude refers to the x coordinate and longitude refers to the y coordinate.

    Supported Coordinate Systems

    MarkLogic Server supports the following coordinate systems for geospatial data:

    WGS84 Coordinate System

    By default, MarkLogic Server uses the World Geodetic System version 1984 (WGS84) as the basis for geocoding. WGS84 is a widely accepted standard for global point representation. WGS84 is an earth-centered geodetic coordinate system with a coordinate system origin at the Earth's center of mass.

    WGS84 is widely used for mapping locations on the Earth, and is used by a wide range of services, including satellite services such as Global Positioning System (GPS) and Google Maps. There are other coordinate systems, some of which have advantages or disadvantages over WGS84. For example, some are more accurate in a given region, while others may be used historically in legacy data.

    ETRS89 Coordinate System

    The European Terrestrial Reference System 1989 (ETRS89) is an earth-centered, earth-fixed geodetic coordinate system, designed primarily for mapping locations in Europe.

    This coordinate system is fixed to the stable part of the Eurasian tectonic plate. As such, it is not subject to continental drift. ETRS89 and WGS84 coordinates are not interchangeable because of this difference in the handling of continental drift.

    Raw Coordinate System

    The raw coordinate system is a Euclidean coordinate system.

    The coordinates of a point in the raw coordinate system represent a position on a two-dimensional Euclidean plane. For details, see Understanding Euclidean Coordinates.

    The raw coordinate system is a simple Cartesian coordinate system, best suited for working with non-geospatial data. However, you can use the raw coordinate system to represent geographical points projected onto a flat plane.

    The Governing Coordinate System

    The governing coordinate system is the coordinate system/precision combination in effect during a geospatial operation. It affects the handling of input values, calculations, comparisons, and return values.

    A precision is always implied by the coordinate system name. For example, wgs84 implies single precision, while wgs84/double implies double precision. However, some operations accept a precision option that enables you to override the precision implicit in the coordinate system name. For details, see Specifying a Per-Operation Coordinate System and Precision.

    The governing coordinate system is based on a precedence ordering of the coordinate system and precision specified in the App Server configuration, a main module prolog (XQuery only), and the parameters or options of a geospatial function (from lowest to highest precedence). For details, see How MarkLogic Selects the Governing Coordinate System.

    How Precision Affects Geospatial Operations

    The governing coordinate system always has an associated precision, either float (single) or double. Some operations allow you to override the precision implied by the coordinate system through an option.

    • The original precision of your data is always preserved in your documents.
    • Geospatial data is indexed using the precision configured for the index.
    • Geospatial points and regions are serialized at the precision of the governing coordinate system.
    • Comparison operations on geospatial regions use the precision of the governing coordinate system.
    • Functions operating on geospatial data interpret their input, perform their calculations, and return their results using the governing coordinate system.
    • Functions that return geospatial points or regions return either single or double precision coordinates, depending on the governing coordinate system.
    • Accessor functions for geospatial points or regions return either a single or double precision value, depending on the governing coordinate system. This applies to XQuery functions such as cts:point-latitude, cts:circle-radius, cts:box-west, and their Server-Side JavaScript equivalents.
    • Latitude and longitude bounds on box functions are not truncated to the single precision range if the governing coordinate system is double precision. This applies to XQuery functions such as cts:geospatial-boxes and cts:element-geospatial-boxes, and their Server-Side JavaScript equivalents.
    • Geospatial operations perform calculations using the precision of the governing coordinate system. This applies to functions such as cts:distance, cts:polygon-contains, and cts:bounding-boxes, and their Server-Side JavaScript equivalents.
    • The input pattern parameter to value-match functions can be either single or double precision, depending on the governing coordinate system. This applies to XQuery functions such as cts:element-geospatial-value-match and their Server-Side JavaScript equivalents.
    • Searches involving geospatial queries use the precision of the governing coordinate system for determining matches and calculating scores.

    Understanding MarkLogic Geospatial Region Types

    This section provides a conceptual overview of the types of regions supported by MarkLogic. Points are the building blocks of most regions; to learn more about points, see Understanding Points.

    Most geospatial interfaces in MarkLogic work with geospatial data represented as a cts:region (XQuery) or cts.region (Server-Side JavaScript), or an equivalent serialization. The cts region type is an abstraction that can represent any of the following concrete geospatial types:

    Boxes

    A geospatial box is a rectangular region consisting of all the points whose latitude and longitude coordinates are within the region bounds.

    In a geodetic coordinate system, a box is a projection from the three-dimensional Earth onto a flat surface. On the surface of the Earth, the edges of a box are arcs. When you project the edges onto a flat plane, they become two-dimensional latitude and longitude lines, and the space defined by those lines forms a rectangle.

    The following diagram uses a plate carrée projection to illustrate the difference between the region defined by a box on the surface of the Earth and its projection into a rectangular region on a flat plane.

    In a geodetic coordinate system, the north and south edges of a box are latitude lines, not geodesic arcs. The east and west edges of a box are longitude lines, which are geodesic arcs. A box is not equivalent to a polygon with the same four vertices.

    A point is contained in a box if its latitude coordinate is between the north and south latitude coordinates of the box, and its longitude coordinate is between the west and east longitude coordinates of the box.

    In a Euclidean coordinate system, a box is simply a rectangle with boundaries defined by north, south, east, and west coordinates. In a Euclidean coordinate system, a box is equivalent to a polygon with the same four vertices.

    The following assumptions and restrictions only apply to boxes in a geodetic coordinate system:

    • In a geodetic coordinate system, the west/east extent of a box is determined by starting at the western longitude coordinate and heading east toward the eastern longitude coordinate. If the west coordinate is less than the east coordinate, the box will not cross the anti-meridian. If the east coordinate is less than the west coordinate, the box crosses the anti-meridian.
    • In a geodetic coordinate system, the south/north extent of a box is determined by starting at the southern latitude coordinate and heading north to the northern latitude coordinate. However, you cannot cross the pole: the northern coordinate must be greater than the southern coordinate.
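    The containment and anti-meridian rules above can be sketched as a simple point-in-box test. This is an illustration of the semantics, not the MarkLogic implementation; for simplicity it treats box boundaries as included, whereas in MarkLogic query options control boundary behavior:

```javascript
// Sketch of the geodetic point-in-box rule, including a box that
// crosses the anti-meridian (east < west). Boundaries are treated as
// included here; in MarkLogic, query options control that behavior.
function boxContains(box, point) {
  const inLatitude = point.lat >= box.south && point.lat <= box.north;
  const inLongitude =
    box.west <= box.east
      ? point.lon >= box.west && point.lon <= box.east  // normal box
      : point.lon >= box.west || point.lon <= box.east; // crosses anti-meridian
  return inLatitude && inLongitude;
}

// A box spanning the anti-meridian in the Pacific.
const pacific = { south: -10, west: 170, north: 10, east: -170 };
console.log(boxContains(pacific, { lat: 0, lon: 175 }));  // true
console.log(boxContains(pacific, { lat: 0, lon: -175 })); // true
console.log(boxContains(pacific, { lat: 0, lon: 0 }));    // false
```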

    The following assumptions and restrictions apply to boxes under both geodetic and Euclidean coordinate systems:

    • If the western and eastern coordinates are the same, the box is a meridian line segment between the southern and northern coordinates passing through that longitude coordinate.
    • If the southern and northern coordinates are the same, the box is a latitude line segment between the western and eastern coordinates passing through that latitude coordinate.
    • If the western and eastern coordinates are the same, and the southern and northern coordinates are the same, then the box is a point specified by those coordinates.
    • During a search, the query options determine whether the boundaries of a box are included in or excluded from the box. Various boundary options on the geospatial query constructors control this behavior.

    In the raw coordinate system, the western coordinates are always less than or equal to the eastern coordinates, and the southern coordinates are always less than or equal to the northern coordinates.

    The cts:box XQuery type and cts.box JavaScript type represent a box in MarkLogic Server. You can create a box using the cts:box XQuery constructor or the cts.box JavaScript constructor. You can also create a box using one of the conversion utility functions such as geogml:box (XQuery) or geojson.box (JavaScript). For more details, see Constructing Geospatial Point and Region Values.

    Polygons

    A geospatial polygon is a region with three or more sides. The following diagram illustrates several polygons.

    In a geodetic coordinate system, a polygon can represent any area on the Earth (with the exceptions described below). For example, you might create a polygon to represent a country or a geographical region.

    Polygons offer a large degree of flexibility compared to circles or boxes. In exchange for the flexibility, operations on geospatial polygons are not quite as fast or accurate as geospatial box and circle operations.

    The efficiency of polygon operations is proportional to the number of sides to the polygon. For example, a typical 10-sided polygon will likely perform faster than a typical 1000-sided polygon. The speed is dependent on many factors, including where the polygon is, the nature of your geospatial data, and so on.

    The following assumptions and restrictions apply to polygons only under a geodetic coordinate system in MarkLogic:

      A geodetic coordinate system treats the earth as an ellipsoid. In such a system, the edges of a polygon are geodesic arcs, not latitude lines.
      A polygon cannot include both poles and cannot have both poles as a boundary (regardless of whether the boundaries are included). Thus, a polygon cannot encompass the full 180 degrees of latitude.
      The span of the arc described by a polygon edge in a geodetic coordinate system must be between 0 and 180 degrees and cannot cross a pole. If you need to span more than 180 degrees, define multiple edges that cover the desired span.

    Latitude lines are distinct from geodesic arcs. Except for the equator, the shortest distance between two points at the same latitude does not follow the latitude line. The edges of polygons are geodesic arcs, not latitude lines. You can approximate a latitude line by adding vertices evenly spaced along the latitude line. The north and south edges of a box are latitude lines; if the region to be described is a box, use a cts:box or cts.box instead of a polygon.

    The following assumptions and restrictions apply to polygons under either a geodetic or Euclidean coordinate system in MarkLogic.

      No two edges of a polygon or complex polygon may overlap or cross.
      The coordinate system is considered at search time rather than when you construct a polygon value. Therefore, a search throws a runtime exception if a polygon is not valid for the governing coordinate system.
      The boundaries of a polygon are either in or out of the polygon, depending on the operation and query options. The DE9IM operators include specific boundary behaviors; for other operations, you can use query constructor options to control the boundary behavior.

    You can construct a polygon by specifying the points that make up the vertices of the polygon. All points that are bounded by the resulting region are defined to be contained within the region.

    For details, see the cts:polygon XQuery function or the cts.polygon JavaScript function.

    Complex Polygons

    A complex polygon is a polygon with one or more holes. For example, the following graphic illustrates the difference between a polygon and a complex polygon. The complex polygon is the shaded region on the right. The unshaded region, or inner polygon, represents a hole in the outer polygon.

    You can construct a complex polygon by combining an outer polygon with zero or more inner polygons. All inner polygons must be completely contained in the interior of the outer polygon. No two edges can cross or overlap. Use the cts:complex-polygon XQuery function or the cts.complexPolygon JavaScript function to construct a complex polygon.

    You can also cast a cts:complex-polygon or cts.complexPolygon with no holes (that is, with no inner polygons) to a cts:polygon or cts.polygon. If you specify multiple inner polygons, none of them should overlap each other.

    Linestrings

    A linestring is a connected sequence of edges. In a geodetic coordinate system, edges are geodesic arcs. In a Euclidean coordinate system such as raw, the edges are straight lines.

    A linestring does not necessarily form a closed loop as the boundary of a polygon does, although it is permissible for a linestring to form a closed loop. The following diagram demonstrates some examples of linestrings.

    You can compare linestrings for equality or inequality. Two linestrings are equal if all of their vertices are equal, or if they are both empty.

    To construct a linestring, use the cts:linestring XQuery function or the cts.linestring JavaScript function.

    Circles

    A geospatial circle consists of all the points within a certain distance (the radius) of a given center point. A geospatial region that represents a circle is defined by its center point and radius. The points that are the distance of the radius from the center define the boundary of the region.

    Use the cts:circle XQuery function or the cts.circle JavaScript function to construct a circle.
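    As a sketch in Server-Side JavaScript (this code runs inside MarkLogic; the center coordinates are illustrative, and the radius is interpreted in the units of the operation that consumes the circle, miles by default):

```
// A circle with radius 1 around an illustrative center point (lat, long).
const circle = cts.circle(1.0, cts.point(37.5073428, -122.2465038));
```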

    Understanding Geospatial Query and Index Types

    This topic discusses the types of geospatial query you can create, the index types that support each query type, and the data layout expected by each query and index type. The following topics are covered:

    Introduction to Geospatial Query and Index Types

    MarkLogic supports several types of query for searching geospatial data contained in documents. In general, geospatial queries fall into the following two categories, based on the kind of geospatial document content to be matched:

      Point query: Match points in documents against points or other regions specified as input criteria. For example, Find all documents containing a point within this circle.
      Region query: Match other regions in documents that satisfy one of a number of relationships when compared to regions specified as input criteria. For example, Find all documents containing polygons that intersect with this polygon.

    For best performance, a point query should be supported by a corresponding geospatial index. A region query always requires a backing geospatial region index. MarkLogic supports several types of geospatial index, corresponding to the different geospatial query types.

    Select a geospatial point query or index type based on the layout of your data. The query or index type varies depending on whether the data is represented in XML or JSON, and whether the point coordinates are represented as a single compound value (lat lon) or as distinct latitude and longitude values. For example, you might use a cts:element-geospatial-query and a geospatial element index for points represented as a single compound XML element value.

    The data layout for a region query or region index must be WKT or a serialized cts region, such as a cts:polygon. The region data is located within a document using an XPath expression when creating a query or index. Therefore, you use a cts:geospatial-region-query and a geospatial region path index for querying by region.

    The following table summarizes the query and index types MarkLogic supports, based on the axes of geospatial content type (point or other region) and layout.

    Any coordinate pair addressable with an indexable XPath expression.

    Any serialized cts region or WKT value addressable with an indexable XPath expression.

    Geospatial Query Creation

    You can create a geospatial cts query in the following ways.

      Using an XQuery or Server-Side JavaScript query constructor, such as cts:element-geospatial-query (XQuery) or cts.pathGeospatialQuery (JavaScript).
  • Parsing query text containing a geospatial search term. For details, see Constructing a Point Query in XQuery or Constructing a Region Query from Query Text.
  • You can also create a geospatial structured query or Query By Example for use with the Search API or the Client APIs. The Java and Node.js Client APIs include builder interfaces for creating structured queries.

    For more details, see the sections on each query/index type elsewhere in this section and the following topics:

    Geospatial Index Creation

    Region queries require a region index, but an index is optional for some point queries. For best performance, you should usually create a geospatial index for both query types.

    You must have a valid geospatial license key to create or use any geospatial indexes.

    Use a geospatial region path index when matching regions in your documents. Use a geospatial point index when matching points in your documents; the type of point index depends on the layout of your content. For details, see Introduction to Geospatial Query and Index Types.

    When creating a point index, you can specify the coordinate system, coordinate value precision, and point type (long-lat or lat-long). When creating a region index, you can specify the coordinate system and geohash precision. The default coordinate system is WGS84. The default coordinate precision is float (single precision), and the default point type is point (lat-long).

    When you create an index using the Admin API, index properties such as coordinate system and precision are specified through the index reference constructor function, such as admin:database-geospatial-element-index or admin:database-geospatial-region-path-index. For an example of index creation using the Admin API, see Configuring the Indexes.

    You can create a geospatial index using the following methods:

      Interactively, using the Admin Interface. See the Geospatial Point Indexes or Geospatial Region Indexes section under Database > database_name in the Admin Interface.
      Programmatically, using the server-side Admin API functions. For example, to create a geospatial element index, use the XQuery function admin:database-add-geospatial-element-index or the JavaScript function admin.databaseAddGeospatialElementIndex.
      Programmatically, using the REST Management API. For details, see the PUT /manage/v2/databases//properties method.

    For more details, see the sections on each query/index type, below.

    Geospatial XML Element Point Queries and Indexes

    Use a geospatial element query when the point coordinates in your documents are represented as the value of a single XML element, with the latitude and longitude values separated by whitespace or punctuation (except +, -, or .). For example:
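    The original example is not reproduced here; as an illustration (the element name is assumed), such a layout might look like this, with the latitude first by default:

```xml
<location>37.5073428 -122.2465038</location>
```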

    By default, the first coordinate is the latitude value, and the second coordinate is the longitude value. You can override the default order by specifying a longitude-first ordering when creating queries and indexes.

    If the element value contains other coordinates, they are ignored. For example, KML data can include an additional altitude coordinate. The altitude can be present but is ignored.

    When you use a geospatial element query, you should also create a corresponding geospatial element index for best performance.

    You can use the following interfaces to create a geospatial element query:

    Interface Query Constructor
    XQuery cts:element-geospatial-query
    Server-Side JavaScript cts.elementGeospatialQuery
    Structured Query geo-elem-query
    Java Client API
    Node.js Client API queryBuilder.geoElement

    You can use the following interfaces to create a geospatial element index:

    Interface Index Construction Method
    Admin Interface Databases > database_name > Geospatial Point Indexes > Geospatial Element Indexes
    XQuery Admin API (also usable with JavaScript) admin:database-add-geospatial-element-index
    REST Management API PUT /manage/v2/databases//properties

    Geospatial XML Element Child Point Queries and Indexes

    Use a geospatial element child index for geospatial point data when the coordinates are contained in an XML element value, separated by whitespace or punctuation (except +, -, or .), and you want to identify the container element as a child of another specific element. For example:
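    The original example is not reproduced here; as an illustration (element names assumed), the layout might look like the following, where an element child index would name feature as the parent and location as the child:

```xml
<feature>
  <location>37.5073428 -122.2465038</location>
</feature>
```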

    By default, the first coordinate is the latitude value, and the second coordinate is the longitude value. You can override the default order by specifying a longitude-first ordering when creating queries and indexes.

    If the element value contains other coordinates, they are ignored. For example, KML data can include an additional altitude coordinate. The altitude can be present but is ignored.

    When you use a geospatial element child query, you should also create a corresponding geospatial element child index for best performance.

    You can use the following interfaces to create a geospatial element child query:

    Interface Query Constructor
    XQuery cts:element-child-geospatial-query
    Server-Side JavaScript cts.elementChildGeospatialQuery
    Structured Query geo-elem-query
    Java Client API
    Node.js Client API queryBuilder.geoElement

    You can use the following interfaces to create a geospatial element child index:

    Interface Index Construction Method
    Admin Interface Databases > database_name > Geospatial Point Indexes > Geospatial Element Child Indexes
    XQuery Admin API (also usable with JavaScript) admin:database-add-geospatial-element-child-index
    REST Management API PUT /manage/v2/databases//properties

    Geospatial XML Element Pair Point Queries and Indexes

    Use a geospatial element pair index for geospatial point data when the longitude and latitude are values in two different elements that are children of the same parent element. For example:
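    The original example is not reproduced here; as an illustration (element names assumed), a pair layout might look like:

```xml
<feature>
  <latitude>37.5073428</latitude>
  <longitude>-122.2465038</longitude>
</feature>
```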

    You can use the following interfaces to create a geospatial element pair query:

    Interface Query Constructor
    XQuery cts:element-pair-geospatial-query
    Server-Side JavaScript cts.elementPairGeospatialQuery
    Structured Query geo-elem-pair-query
    Java Client API
    Node.js Client API queryBuilder.geoElementPair

    You can use the following interfaces to create a geospatial element pair index:

    Interface Index Construction Method
    Admin Interface Databases > database_name > Geospatial Point Indexes > Geospatial Element Pair Indexes
    XQuery Admin API (also usable with JavaScript) admin:database-add-geospatial-element-pair-index
    REST Management API PUT /manage/v2/databases//properties

    Geospatial XML Attribute Pair Point Queries and Indexes

    Use a geospatial attribute pair index for geospatial point data when the longitude and latitude are values in two different attributes of the same parent XML element. For example:
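    The original example is not reproduced here; as an illustration (element and attribute names assumed), an attribute pair layout might look like:

```xml
<feature latitude="37.5073428" longitude="-122.2465038"/>
```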

    When you use a geospatial attribute pair query, you should also create a corresponding geospatial attribute pair index for best performance.

    You can use the following interfaces to create a geospatial element attribute pair query:

    Interface Query Constructor
    XQuery cts:element-attribute-pair-geospatial-query
    Server-Side JavaScript cts.elementAttributePairGeospatialQuery
    Structured Query geo-attr-pair-query
    Java Client API
    Node.js Client API queryBuilder.geoAttributePair

    You can use the following interfaces to create a geospatial element attribute pair index:

    Interface Index Construction Method
    Admin Interface Databases > database_name > Geospatial Point Indexes > Geospatial Attribute Pair Indexes
    XQuery Admin API (also usable with JavaScript) admin:database-add-geospatial-element-attribute-pair-index
    REST Management API PUT /manage/v2/databases//properties

    Geospatial Path Point Queries and Indexes

    Use a geospatial path query and index for matching points when you want to express the location of the points using an XPath expression. The data layout must be one of the following:

      A single XML element value with the latitude and longitude coordinates separated by whitespace or punctuation, as for a geospatial element query. A single JSON property value with the latitude and longitude coordinates separated by whitespace or punctuation, as for a geospatial JSON property query. A JSON array value containing a latitude element and a longitude element, as for a geospatial JSON property query.

    By default, the first coordinate is the latitude value, and the second coordinate is the longitude value. You can override the default order by specifying a longitude-first ordering when creating queries and indexes.

    The path expression with which you define the index is limited to a subset of XPath for performance reasons. For details, see Path Field and Path-Based Range Index Configuration in the XQuery and XSLT Reference Guide.

    The following table demonstrates the XPath expression to use when creating a path range index for several forms of example geospatial data.

    Document Type Example Data Indexing Path Expression
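    The original table rows are not reproduced here; as a hedged illustration (element and property names assumed), data and path pairings might look like:

```
XML:   <envelope><location>37.5073428 -122.2465038</location></envelope>
       path: /envelope/location
JSON:  { "location": "37.5073428 -122.2465038" }
       path: /location
JSON:  { "location": [37.5073428, -122.2465038] }
       path: /location
```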

    Once you create a geospatial path range index, you cannot change the path expression. To change the path, you must remove the existing geospatial path range index and create a new one.

    You can use the following interfaces to create a geospatial path query:

    Interface Query Constructor
    XQuery cts:path-geospatial-query
    Server-Side JavaScript cts.pathGeospatialQuery
    Structured Query geo-path-query
    Java Client API
    Node.js Client API queryBuilder.geoPath and queryBuilder.geospatial

    You can use the following interfaces to create a geospatial path index:

    Interface Index Construction Method
    Admin Interface Databases > . > Geospatial Point Indexes > Geospatial Path Indexes
    XQuery Admin API (also usable with JavaScript) admin:database-add-geospatial-path-index
    REST Management API PUT /manage/v2/databases//properties

    Geospatial JSON Property Point Queries and Indexes

    Use a geospatial element index to index geospatial data in JSON documents when the point coordinates are contained in a single JSON property. The geospatial data must be represented in the property value as either whitespace/punctuation separated values in a string, or as an array of values. For example:
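    The original example is not reproduced here; as an illustration (the property name is assumed), the two supported layouts might look like:

```
{ "location": "37.5073428 -122.2465038" }

{ "location": [37.5073428, -122.2465038] }
```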

    By default, the first coordinate is the latitude value, and the second coordinate is the longitude value. You can override the default order by specifying a longitude-first ordering when creating queries and indexes. The property value can include other entries, but they are ignored (for example, KML has an additional altitude coordinate, which can be present but is ignored).

    You can use the following interfaces to create a geospatial JSON property query:

    Interface Query Constructor
    XQuery cts:json-property-geospatial-query
    Server-Side JavaScript cts.jsonPropertyGeospatialQuery
    Structured Query geo-json-property-query
    Java Client API
    Node.js Client API queryBuilder.geoProperty

    You can use the following interfaces to create an index for a geospatial JSON property query. Note you should create a geospatial element index, even though you are indexing JSON content.

    Interface Index Construction Method
    Admin Interface Databases > database_name > Geospatial Point Indexes > Geospatial Element Indexes
    XQuery Admin API (also usable with JavaScript) admin:database-add-geospatial-element-index
    REST Management API PUT /manage/v2/databases//properties

    Geospatial JSON Property Child Point Queries and Indexes

    Use a geospatial element child index to index geospatial data in JSON when you want to limit the index to coordinate properties contained in a specific property. The geospatial data must be represented in the child property value as either whitespace/punctuation separated values in a string, or as an array of values.

    For example, if your data looks like one of the following, you could create a geospatial element child index specifying "theParent" as the parent element (property) and "theChild" as the child element (property).
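    The original example is not reproduced here; using the theParent and theChild names from the text, the two layouts might look like:

```
{ "theParent": { "theChild": "37.5073428 -122.2465038" } }

{ "theParent": { "theChild": [37.5073428, -122.2465038] } }
```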

    By default, the first coordinate is the latitude value, and the second coordinate is the longitude value. You can override the default order by specifying a longitude-first ordering when creating queries and indexes. The property value can include other entries, but they are ignored (for example, KML has an additional altitude coordinate, which can be present but is ignored).

    You can use the following interfaces to create a geospatial JSON property child query:

    Interface Query Constructor
    XQuery cts:json-property-child-geospatial-query
    Server-Side JavaScript cts.jsonPropertyChildGeospatialQuery
    Structured Query geo-json-property-query
    Java Client API
    Node.js Client API queryBuilder.geoProperty

    You can use the following interfaces to create an index for a geospatial JSON property child query. Note you should create a geospatial element child index, even though you are indexing JSON content.

    Interface Index Construction Method
    Admin Interface Databases > database_name > Geospatial Point Indexes > Geospatial Element Child Indexes
    XQuery Admin API (also usable with JavaScript) admin:database-add-geospatial-element-child-index
    REST Management API PUT /manage/v2/databases//properties

    Geospatial JSON Property Pair Point Queries and Indexes

    Use a geospatial element pair index to index geospatial data in JSON when the point coordinates are contained in sibling JSON properties. For example, use this type of index when working with data similar to the following:
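    The original example is not reproduced here; as an illustration (property names assumed), sibling-property data might look like:

```json
{ "latitude": 37.5073428, "longitude": -122.2465038 }
```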

    You can use the following interfaces to create a geospatial JSON property pair query:

    Interface Query Constructor
    XQuery cts:json-property-pair-geospatial-query
    Server-Side JavaScript cts.jsonPropertyPairGeospatialQuery
    Structured Query geo-json-property-pair-query
    Java Client API
    Node.js Client API queryBuilder.geoPropertyPair

    You can use the following interfaces to create an index for a geospatial JSON property pair query. Note you should create a geospatial element pair index, even though you are indexing JSON content.

    Interface Index Construction Method
    Admin Interface Databases > database_name > Geospatial Point Indexes > Geospatial Element Pair Indexes
    XQuery Admin API (also usable with JavaScript) admin:database-add-geospatial-element-pair-index
    REST Management API PUT /manage/v2/databases//properties

    Geospatial Region Queries and Indexes

    Use a geospatial region path index to index geospatial regions, such as polygons, rather than points. A geospatial region path index supports operations such as the cts:geospatial-region-query XQuery function and the cts.geospatialRegionQuery JavaScript function. These functions enable you to test for relationships between regions, such as overlaps and contains.

    Region indexes over geodetic coordinate systems are based on geohashing. Geohashes of circles are calculated by approximating the circle by a polygon. The approximation is accurate to within 0.001% of the radius of the circle. If you require more precision, use geo:circle-polygon to convert circles in your data.

    When working with large circular regions, you might need to adjust the tolerance in your geospatial operations. For details, see Understanding Tolerance.

    The path expression with which you define a region index is limited to a subset of XPath for performance reasons. For details, see Path Field and Path-Based Range Index Configuration in the XQuery and XSLT Reference Guide.

    The content referenced by the path expression in a geospatial region index must be a region represented as either WKT or a serialized cts:region. For example:

    Format Example Data Indexing Path Expression
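    The original table rows are not reproduced here; as an illustration (the element name is assumed), a WKT polygon stored in an XML element might look like the following, indexed with the path /region. Note that WKT uses longitude-first coordinate order, and a polygon ring must close on its starting point:

```xml
<region>POLYGON((-122.25 37.50, -122.24 37.50, -122.24 37.51, -122.25 37.50))</region>
```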

    If your data is not in the expected format, you can use an envelope pattern to encapsulate your original data along with a supported format. For more details, see Example: Using the Envelope Pattern to Encode Regions.

    You can use the following interfaces to create a geospatial region path query. For more details, see Searching for Matching Regions.

    Interface Query Constructor
    XQuery cts:geospatial-region-query
    Server-Side JavaScript cts.geospatialRegionQuery
    Structured Query geo-region-path-query and geo-region-constraint-query
    Java Client API
    Node.js Client API queryBuilder.geoPath and queryBuilder.geospatialRegion

    You can use the following interfaces to create a geospatial region path index.

    Interface Index Construction Method
    Admin Interface Databases > database_name > Geospatial Region Indexes
    XQuery Admin API (also usable with JavaScript) admin:database-add-geospatial-region-path-index
    REST Management API PUT /manage/v2/databases//properties

    Geospatial Index Positions

    Each geospatial point index has a range value positions option. Enabling range value positions speeds up queries that constrain a search by the distance between geospatial data and other search terms in a document, such as when using cts:near-query in XQuery or cts.nearQuery in JavaScript.

    Additionally, enabling element positions improves index resolution (more accurate estimates) for XML element and JSON property queries that involve geospatial point queries (with a geospatial index with positions enabled for the geospatial data).

    Geospatial Lexicons

    Geospatial point indexes enable geospatial lexicon lookups. The lexicon lookups enable very fast retrieval of geospatial values. For details on geospatial lexicons, see Geospatial Lexicons.

    Index Reference Resolution

    Many geospatial operations either require or will take advantage of available geospatial indexes. Depending on the operation, the index reference might be explicit or implicit. For example, if you supply a cts:reference to an operation, the index reference is explicit. By contrast, when you supply an XPath expression, XML element QName, or JSON property name to a query constructor, the index reference is implicit.

    Often, an index reference doesn't fully specify the characteristics of an index. For example, if you create a region path query and specify no options, you've only supplied the type of index (geospatial region path index) and the path. You have not explicitly specified the coordinate system, precision, or point type. Thus, they implicitly default to wgs84, single, and point, respectively.

    MarkLogic attempts to resolve an index reference from the information in the call, including options, plus the defaults. If this is sufficient to identify a unique index, that index will be used. If it is not, an error is raised.

    For example, suppose you create a geospatial region index on the path /coordinates , with coordinate-system and precision wgs84/double. If you then construct a region query on the path /coordinates and specify the option coordinate-system=wgs84, the precision is implicitly single precision, which will not match the only available index. You will get an XDMP-GIDXNOTFOUND error.

    Similarly, suppose you create one geospatial region index on the path /coordinates , with coordinate-system and precision wgs84/double and another on the same path with wgs84 (single precision). If you then create a region path query on /coordinates and do not specify the coordinate system, the index reference is ambiguous and you will get an XDMP-GIDXAMBIGUOUS error.

    Searching for Matching Points

    This section describes how to use a point query to find documents containing specific points or documents containing points in specific regions. You should use a point query rather than a region query when searching for points because point queries are usually faster than region queries.

    This section covers the following topics:

    Point Search Overview

    A point query finds documents containing one or more points that match search criteria regions. The search criteria regions can be points, circles, linestrings, polygons, or any other cts region type. (To find matching regions, rather than points, see Searching for Matching Regions).

    The following are key features of searching with point queries:

      A point matches a criteria region if it is contained in the region. You can use options to control whether or not the criteria region boundaries should be considered in the match. Boundaries are included by default.
      You can use point queries with the same search framework as other kinds of queries, such as cts:search, cts.search, jsearch.documents, search:search, or the Client APIs.
      You can use a point query by itself or as a component of a more complex query, such as a cts:and-query (XQuery) or cts.andQuery (JavaScript).
      You can construct a geospatial point query using an XQuery or JavaScript query constructor, by parsing query text, or using the REST, Java, or Node.js Client APIs.
      Creating appropriate geospatial point indexes can improve speed and accuracy.

    Indexes are required for certain kinds of queries, such as range queries. Indexes are optional for queries such as value queries, but only if you use unfiltered search. For details, see Fast Pagination and Unfiltered Searches in the Query Performance and Tuning Guide.

    For example, the following search uses an element child geospatial query to match documents containing at least one point in the circle with center (37.5073428,-122.2465038) and radius 1 mile. The circle criteria region is constructed using the cts:circle XQuery function or cts.circle JavaScript function.

    Language Example
    XQuery
    JavaScript

    (The above queries were written for sample documents containing KML geospatial data, so an element child query is used to confine matches to coordinates in KML <Point/> elements. The long-lat point type is used because KML coordinates are expressed in longitude-first order.)
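    The original example code is not reproduced here; a Server-Side JavaScript sketch of such a search (the KML namespace and element names are assumed from the description of the sample data) might look like:

```
// Match documents with at least one KML point inside a 1-mile circle.
// KML coordinates are longitude-first, hence the type=long-lat-point option.
const kmlNs = 'http://www.opengis.net/kml/2.2';
cts.search(
  cts.elementChildGeospatialQuery(
    fn.QName(kmlNs, 'Point'),
    fn.QName(kmlNs, 'coordinates'),
    cts.circle(1, cts.point(37.5073428, -122.2465038)),
    ['type=long-lat-point']
  )
);
```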

    The MarkLogic APIs also include geospatial utility functions useful for constructing criteria and analyzing search matches. For example, you can use the cts:region-contains XQuery function or the cts.regionContains JavaScript function to test whether one region contains another. The utility functions are usable with in-memory geospatial data, as well as data in documents in the database. For details, see Summary of Other Geospatial Operations.

    Example: Point Query Using XQuery

    This example uses XQuery to demonstrate the following types of point query:

    For an equivalent Server-Side JavaScript example, see Example: Point Query Using JavaScript. The example assumes the data and database configuration from Preparing to Run the Examples.

    The sample data is XML documents containing KML data of the following form. For more details on the sample documents, see Overview of the Sample Data.

    The example uses cts:element-child-geospatial-query to find matches in the coordinates element of a KML Point element. Limiting the scope to coordinates in a Point element prevents false positives from the documents containing other kinds of regions. For example:

    The query includes the type=long-lat-point option because KML uses longitude-first coordinate order while the default in MarkLogic is latitude-first ( "type=point" ).

    The database configuration includes a corresponding geospatial element child index on kml:Point/kml:coordinates with long-lat point type.

    The following code performs one search for documents containing the coordinates of the MarkLogic headquarters (cts:point(37.5073428, -122.2465038)) and one search for documents containing points in the MarkLogic Neighborhood polygon. The polygon coordinates are extracted from one of the sample documents, but you could also construct them inline.

    If you run this query in Query Console, it produces output similar to the following:

    You can compose complex queries by combining geospatial queries with other query types. For example, the following code matches documents that contain points within a circle and that also contain the word MarkLogic:

    Though the previous examples only searched the XML sample documents, you can apply a geospatial query to either XML or JSON documents, or both. For example, the following code searches both the XML and JSON sample documents by combining two geospatial queries in an OR query. (The point search criteria matches the MarkLogic HQ feature in the sample documents.)

    Running this query in Query Console produces the following output:

    Example: Point Query Using JavaScript

    This example uses Server-Side JavaScript to demonstrate the following types of point query:

    For an equivalent XQuery example, see Example: Point Query Using XQuery. This example assumes the data and database configuration from Preparing to Run the Examples.

    The sample data includes JSON documents containing GeoJSON data of the following form. For more details on the sample documents, see Overview of the Sample Data.

    The example uses cts.pathGeospatialQuery to find matching documents. You must use a path query for point queries on GeoJSON for the reasons described in Geospatial Data in the Application Developer's Guide. The following path addresses the coordinates array of a point feature in the sample documents:

    Thus, the core of the search is a path query of the following form. The query includes the type=long-lat-point option because GeoJSON uses longitude-first coordinate order, while the default in MarkLogic is latitude-first ("type=point").
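    As a rough sketch of such a path query (this requires a MarkLogic Server-Side JavaScript environment; the path and point values are taken from the sample data described in the surrounding text):

```javascript
// Sketch only: cts.* functions exist only inside MarkLogic Server.
cts.pathGeospatialQuery(
  'geometry[type = "Point"]/array-node("coordinates")', // path to GeoJSON coordinates
  cts.point(37.5073428, -122.2465038),                  // criteria point (lat, lon)
  ['type=long-lat-point']                               // GeoJSON stores longitude first
)
```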

    The database configuration must include a corresponding geospatial path index. The instructions in Preparing to Run the Examples include creating a suitable index.

    The following code uses the JSearch API to perform two searches: one for documents containing a point (cts.point(37.5073428, -122.2465038)), and one for documents containing points in a region. The region coordinates are extracted from one of the sample documents for convenience, but you could also construct the region inline using a geospatial constructor such as cts.polygon.

    If you run this query in Query Console, it produces output similar to the following:

    Note that a lambda expression and the map method are used to extract just the feature names from the matched documents:

    This is a contrivance used to keep the example output brief. If you remove the map call, the search returns a Sequence of document descriptors that include the full document. For more details, see Creating JavaScript Search Applications.

    You can also use cts.search to perform an equivalent search. For example:

    You can include multiple criteria in a single query; when you do so, encapsulate the criteria in an array. For example, you could search for matches to both the point and the region with a query such as the following. A document matches if it matches any one of the criteria.
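    A minimal sketch of the array form (Server-Side JavaScript, runnable only inside MarkLogic; the variables holding the point and region are hypothetical):

```javascript
// Sketch only. Passing an array of criteria regions makes the query
// an implicit OR over the regions: a document matches if any region matches.
cts.pathGeospatialQuery(
  'geometry[type = "Point"]/array-node("coordinates")',
  [hqPoint, neighborhoodPolygon], // hypothetical: a cts.point and a cts.polygon
  ['type=long-lat-point']
)
```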

    Since the MarkLogic HQ feature document satisfies both criteria, and the Museum and Restaurant feature documents satisfy the region criteria, the above query matches the Restaurant, Museum, and MarkLogic HQ features.

    You can compose complex queries by combining geospatial queries with other query types. Notice that the above cts.search example search uses a cts.andQuery to combine the geospatial path query with a collection query that constrains the search to the JSON documents in the sample set.

    To include the XML sample documents in the search, add a cts.elementChildGeospatialQuery on the KML data. For example, the following query finds documents containing the MarkLogic HQ coordinates in either the XML or JSON sample documents, and prints out the URIs of the matched documents:

    Running the above query in Query Console produces the following output:

    Constructing a Point Query in XQuery

    This section is a quick reference of available XQuery geospatial point query constructors. These functions create a cts:query object. For an equivalent JavaScript reference, see Constructing a Point Query in JavaScript. To create geospatial queries from query text, see Constructing a Point Query from Query Text.

    Use the following functions to construct a point query. Select the query constructor that corresponds to the type of region and layout of the data to be searched, as described in Understanding Geospatial Query and Index Types. You can use these constructors with each other and with other cts:query constructors to build up complex queries.

    Every query constructor includes parameters that identify the content to search (by path, name, or index reference) and one or more geospatial values to match. For example:

    A geospatial query is constrained to the XML elements, XML attributes, and JSON properties identified in the query constructor. To cross multiple formats in a single search, use cts:or-query to combine multiple geospatial queries.

    Constructing a Point Query in JavaScript

    This section is a quick reference of available Server-Side JavaScript geospatial point query constructors. These functions create a cts.query object. For an equivalent XQuery reference, see Constructing a Point Query in XQuery. To create geospatial queries from query text, see Constructing a Point Query from Query Text.

    The following JavaScript geospatial query constructors are available. You can use these constructors with each other and with other cts:query constructors to build up complex queries. Select the query constructor that corresponds to the type of region and layout of the data to be searched, as described in Understanding Geospatial Query and Index Types.

    Every query constructor includes parameters that identify the content to search (by path, name, or index reference) and one or more geospatial values to match. For example:

    A geospatial query is constrained to the XML elements, XML attributes, and JSON properties identified in the query constructor. To cross multiple formats in a single search, use cts.orQuery to combine multiple geospatial queries.

    For a complete example, see Example: Point Query Using XQuery. For more details about constructing geospatial search criteria, see Constructing Geospatial Point and Region Values.

    Constructing a Point Query from Query Text

    You can use the cts:parse XQuery function or the cts.parse JavaScript function to create a geospatial point query from query text. The parse creates a cts query object. This grammar is only supported by the cts parser; the grammar used by search:search or search:resolve does not support geospatial terms.

    The cts parse grammar supports search terms expressing points, circles, boxes, polygons, and other regions, bound to a geospatial index reference. For details, see Binding to a Geospatial Index Reference.

    The following example queries create a geospatial element child query over KML point coordinates. The bindings define the interpretation of the poi (point of interest) tag as a reference to a geospatial element child index. The query text @1 -122.2465038,37.5073428 represents a circle with radius 1 mile (the default units) and center (37.5073428, -122.2465038). The query includes the option type=long-lat-point because KML uses longitude-first ordering for points, while the MarkLogic default ordering is latitude-first.

    Language Example
    XQuery
    JavaScript
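    A rough sketch of such a parse in Server-Side JavaScript (runnable only inside MarkLogic; the tag name, namespace URI, and QNames are assumptions based on the KML sample data described earlier):

```javascript
// Sketch only. The binding maps the "poi" tag to a geospatial element
// child index reference over kml:Point/kml:coordinates.
const query = cts.parse('poi:"@1 -122.2465038,37.5073428"', {
  poi: cts.elementChildGeospatialReference(
    fn.QName('http://www.opengis.net/kml/2.2', 'Point'),
    fn.QName('http://www.opengis.net/kml/2.2', 'coordinates'),
    ['type=long-lat-point'])
})
```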

    The parse produces a cts query similar to the following:

    Language Example Output
    XQuery
    JavaScript

    Creating Point Queries with the Client APIs

    The REST, Java, and Node.js Client APIs expose geospatial queries through the use of structured queries and query builders, rather than through standalone query constructor functions.

    The following topics provide examples of using a point query from the Client APIs:

    You can also use a serialized cts query or a structured query with the Node.js, Java, or REST Client APIs and Search API functions such as search:resolve. See the following topics for details on including a point query in a structured query:

    Java Client API

    This topic assumes you are already familiar with the search features of the Java Client API. If you are not, see the Java Application Developer's Guide.

    You are most likely to construct a geospatial point query with the Java Client API using a StructuredQueryBuilder object. You could also embed a structured point query in a RawCombinedQuery; this technique is not covered here. You cannot create a geospatial point query in Java using query text or QBE.

    Each geospatial point query can only reference a single point index. To search more than one index, construct multiple point queries and combine them with an OR query.

    Use StructuredQueryDefinition.geospatial to create a point query. Choose an overload that accepts a GeospatialIndex as input. A GeospatialIndex object identifies the point index to be searched.

    To construct a GeospatialIndex object, use one of the geospatial index builders of StructuredQueryBuilder, such as StructuredQueryBuilder.geoElement. Choose the index builder that matches your index and data layout; for details, see Understanding Geospatial Query and Index Types.

    For example, the following code snippet identifies a geospatial element child point index corresponding to the KML Point features in the data from Preparing to Run the Examples.

    The following example uses the Java Client API to find XML documents containing the feature named MarkLogic HQ.

    If you run the above program against the sample data and database configuration from Preparing to Run the Examples, you should see output similar to the following:

    The example as written will only match the XML sample documents from Preparing to Run the Examples. You can match the JSON sample documents by changing the index builder to use StructuredQueryBuilder.geoPath and the path geometry[type = "Point"]/array-node("coordinates"), similar to the example in Example: Point Query Using JavaScript.

    Node.js Client API

    This topic assumes you are familiar with the search features of the Node.js Client API. If you are not, you should review the Node.js Application Developer's Guide.

    To construct a geospatial point query, use queryBuilder.geospatial. Use one of the geospatial index reference builders such as queryBuilder.geoPath or queryBuilder.geoElement to construct the point index specification. Choose the builder that corresponds to your index and data layout, as described in Understanding Geospatial Query and Index Types. Use helper functions such as queryBuilder.point to construct the criteria point(s) or region(s).

    Each geospatial point query can only reference a single index. To search more than one index, construct multiple point queries and combine them with an OR query.

    The following example performs the same search as Example: Simple Intersection Region Query. The example relies on the sample documents and database configuration from Preparing to Run the Examples. Before running the example, modify the connection information in connInfo.

    If you run the example against the sample data from Preparing to Run the Examples, you should see output similar to the following:

    As written, the sample code will only match the XML documents in the sample data. To match the JSON documents, use queryBuilder.geoPath instead of queryBuilder.geoElement and the path geometry[type = "Point"]/array-node("coordinates"), similar to the example in Example: Point Query Using JavaScript.

    Creating Geospatial Facets

    Faceted navigation of search results enables users to filter large or complex search results by properties of the data. For example, filter a list of clothing items by size, color, and material. One technique for faceting the results of a point query is to define geospatial boxes that enclose the matched points.

    If you divide a geospatial box into a grid of boxes, you can bucket matched points by the sub-divisions. Each subdivision represents a facet.

    For example, the following diagram plots a series of matched points on a 5x5 grid.

    You can use the number of points in each box and the box extent with mapping APIs like Google Maps to generate a heat map from the search results.

    You can generate such geospatial facet data using the cts:*-geospatial-boxes XQuery functions and the cts.*GeospatialBoxes Server-Side JavaScript functions, such as cts:element-geospatial-boxes or cts.elementGeospatialBoxes. You can use the cts:frequency XQuery function or the cts.frequency JavaScript function to compute the number of points in each box.
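    The bucketing idea can be illustrated in plain JavaScript. This is a conceptual sketch, not the MarkLogic API: it divides a bounding box into an n x n grid and counts matched points per cell, the way a heatmap facet computation buckets results.

```javascript
// Build an n x n grid of boxes covering the given extent.
function gridBoxes(south, west, north, east, n) {
  const latStep = (north - south) / n;
  const lonStep = (east - west) / n;
  const boxes = [];
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      boxes.push({
        south: south + i * latStep,
        west: west + j * lonStep,
        north: south + (i + 1) * latStep,
        east: west + (j + 1) * lonStep,
        count: 0,
      });
    }
  }
  return boxes;
}

// Count points per box; each non-empty box is one facet bucket.
function bucketPoints(points, boxes) {
  for (const [lat, lon] of points) {
    for (const box of boxes) {
      if (lat >= box.south && lat < box.north &&
          lon >= box.west && lon < box.east) {
        box.count++;
        break;
      }
    }
  }
  return boxes.filter((box) => box.count > 0);
}
```

    The resulting counts and box extents are the kind of data you would pass to a mapping API to render a heat map.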

    The XQuery Search API, the JavaScript jsearch API, and the Client APIs provide an equivalent capability at a higher level of abstraction through the heatmap component of a geospatial constraint definition.

    All of these interfaces enable you to define the extent of the box over which to generate facets, and the latitudes and longitudes of the subdivisions.

    You can also control whether the returned facet box extents are based on an even grid or the minimum bounding box that encompasses the points in a given bucket, as shown in the following diagram. By default, MarkLogic generates the minimum bounding boxes. Use the gridded option to override the default.

    The following diagram illustrates the difference between gridded and non-gridded faceting:

    The following XQuery example generates geospatial facets and their counts using cts:element-pair-geospatial-boxes:

    For the Server-Side JavaScript jsearch API, use jsearch.makeHeatMap with FacetDefinition.groupInto to define geospatial facets. For details, see Grouping Values and Facets Into Buckets.

    For the XQuery Search API and the Client APIs, use the heatmap component of a geospatial constraint query option. In these interfaces, you need only specify the faceting box extent and the number of latitude and longitude divisions in the heatmap component. The underlying API computes the grid, the facet boxes, and the counts for you. For example:

    For a more complete example, see Geospatial Constraint Example.

    Searching for Matching Regions

    This section describes how to use a region query to search for regions in your documents. You can match regions using topological operators such as containment and intersection.

    This section covers the following topics:

    Region Match Overview

    A region query matches geospatial regions in your documents against one or more criteria regions. The relationship between the regions that must be satisfied for a match is determined by the operator configured into the query. For example, you can create a query that matches documents containing a region that overlaps or intersects your criteria region(s).

    To construct a region query, use the XQuery cts:geospatial-region-query function or the JavaScript cts.geospatialRegionQuery function. Region queries require a geospatial region path index.

    For example, the following code snippet creates a query that matches documents containing a region that intersects a circle. For a complete example, see Example: Simple Intersection Region Query.

    Language Example
    XQuery
    JavaScript

    The following are key points about using region queries:

    • One region matches another if it satisfies the topological operator configured into the query. You can choose from the following operators: contains, covered-by, covers, crosses, disjoint, equals, intersects, overlaps, within. For details, see cts:geospatial-region-query in the MarkLogic XQuery and XSLT Function Reference or cts.geospatialRegionQuery in the MarkLogic Server-Side JavaScript Function Reference.
    • You must define a geospatial region path index on any regions you want to search with a region query. For details, see Geospatial Region Queries and Indexes.
    • The regions in your documents must be in WKT or serialized cts:region format. If your data is not in one of these formats, you must transform it to conform. For one possible solution, see Example: Using the Envelope Pattern to Encode Regions.
    • You can use a region query with the same search framework as other kinds of queries, such as cts:search, cts.search, jsearch.documents, search:search, or the Client APIs.
    • You can use a region query by itself or as a component of a more complex query, such as a cts:and-query (XQuery) or cts.andQuery (JavaScript).
    • You can construct a geospatial region query using an XQuery or JavaScript query constructor, by parsing query text, or using the REST, Java, or Node.js Client APIs.
    • For best performance, when matching against individual points in your documents, you should usually use a point query rather than a region query.

    The MarkLogic APIs also include geospatial utility functions useful for constructing criteria and analyzing search matches. For example, you can use the cts:region-contains XQuery function or the cts.regionContains JavaScript function to test whether one region contains another. The utility functions are usable with in-memory geospatial data, as well as data in documents in the database. For details, see Summary of Other Geospatial Operations.

    Example: Simple Intersection Region Query

    This example depends on the data and configuration in Preparing to Run the Examples.

    Run one of the following queries in Query Console to find documents that contain a region that intersects with a polygon. The criteria polygon corresponds to the MarkLogic Neighborhood region in the sample data. The query produces the feature names from the matched documents.

    Language Example
    XQuery
    JavaScript

    If you run one of these queries in Query Console, the following feature names should be displayed:

    You can also work with XML documents in JavaScript and work with JSON documents in XQuery, as shown below. The following example performs the same region search against the opposite document type.

    Language Example
    XQuery
    JavaScript

    Notice that the result processing in JavaScript is significantly different because you cannot handle the matched XML documents as native JavaScript objects.

    The following example searches the combined set of both XML and JSON documents. Notice that you can pass multiple region index references into the geospatial region query constructor. A document satisfies the query if a match is found using any of the indexes.

    Language Example
    XQuery
    JavaScript

    If you run one of these queries in Query Console, it emits the following list of URIs:

    Example: Using Region Queries in a Composed Query

    This example depends on the data and configuration in Preparing to Run the Examples.

    You can use geospatial region queries along with other query types to compose more complex queries. For example, the following queries find documents containing a region that intersects with the MarkLogic Neighborhood region, but that do not contain a region that is covered by the MarkLogic Neighborhood region.

    Language Example
    XQuery
    JavaScript

    Query Console displays the following feature names if the query is successful:

    Constructing a Region Query Using a Constructor

    This section demonstrates how to use the constructor functions cts:geospatial-region-query (XQuery) or cts.geospatialRegionQuery (JavaScript) to construct a region query on MarkLogic Server. You can also construct a region query from query text using cts:parse (XQuery) or cts.parse (JavaScript); for details, see Constructing a Region Query from Query Text.

    A region query has the following form:
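    As a hedged sketch of that general shape (Server-Side JavaScript, runnable only inside MarkLogic; the path, operator, and criteria values here are placeholders, not the original sample):

```javascript
// Sketch only: the general shape of a region query constructor call.
cts.geospatialRegionQuery(
  [cts.geospatialRegionPathReference('/envelope/cts-region')], // region index reference(s)
  'intersects',                                                // topological operator
  [cts.circle(1, cts.point(37.507343, -122.2465))],            // criteria region(s)
  ['coordinate-system=wgs84'],                                 // options (optional)
  1.0                                                          // weight (optional)
)
```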

    The options and weight parameters are optional. You can specify multiple region indexes and multiple criteria regions, which are treated as an implicit OR query. That is, a document matches if it satisfies any of the comparisons.

    A region query must be backed by a corresponding geospatial region path index. For more details, see Geospatial Region Queries and Indexes.

    For example, the following constructor creates a region query that matches region data located at the XPath /envelope/cts-region that overlaps with a circle with center (-122.2465, 37.507343) and a radius of 1 mile.

    Language Example Output
    XQuery
    JavaScript

    The operator must be a string with one of the following values, corresponding to the DE-9IM predicates.

    You can construct your criteria regions using region constructors such as cts:polygon (XQuery) or cts.polygon (JavaScript), geospatial format conversion functions such as geogml:linestring (XQuery) or geojson.circle (JavaScript), or WKT or WKB.

    A geospatial query is constrained to the XML elements, XML attributes, and JSON properties identified in the query constructor. To cross multiple formats in a single search, use cts:or-query in XQuery or cts.orQuery in JavaScript to combine multiple geospatial queries.

    For more details, see the following topics:

    Constructing a Region Query from Query Text

    You can use the cts:parse XQuery function or the cts.parse JavaScript function to create a geospatial region query from query text. If you bind a tag to a geospatial region path index, then you can use the tag in an expression of the following form in your query text:

    For example, if reg is the name of a tag bound to a geospatial region index, then the following expression parses to a geospatial region query that matches regions in your documents that overlap with the given polygon:

    The operator must be one of the following. For more details on the operators, see Operators Usable with Geospatial Queries and http://en.wikipedia.org/wiki/DE-9IM.

    • DE9IM_CONTAINS
    • DE9IM_COVERED_BY
    • DE9IM_COVERS
    • DE9IM_CROSSES
    • DE9IM_DISJOINT
    • DE9IM_EQUALS
    • DE9IM_INTERSECTS
    • DE9IM_OVERLAPS
    • DE9IM_TOUCHES
    • DE9IM_WITHIN

    The following example queries illustrate the tag binding and parsing necessary to search using this query text. The binding specifies that the tag reg represents a geospatial region path index reference for the XPath expression /envelope/cts-region. The query text @1 -122.2465038,37.5073428 represents a circle with radius 1 mile (the default units) and center (37.5073428, -122.2465038).

    Language Example
    XQuery
    JavaScript

    The parse produces a cts query similar to the following:

    Language Example Output
    XQuery
    JavaScript

    Creating Region Queries Using the Client APIs

    See the following topics for an overview and example of using a region query in a search in the Client APIs.

    Java Client API

    This topic assumes you are already familiar with the search features of the Java Client API. If you are not, see the Java Application Developer's Guide.

    You are most likely to construct a geospatial region query with the Java Client API using the StructuredQueryBuilder. You could also embed a structured region query in a RawCombinedQuery; this technique is not covered here. You cannot create a geospatial region query in Java using query text or QBE.

    Each geospatial region query can only reference a single region index. To search more than one index, construct multiple region queries and combine them with an OR query.

    Use StructuredQueryDefinition.geospatial to create a region query. Choose an overload that accepts a GeospatialRegionIndex as input. A GeospatialRegionIndex object identifies the region index to be searched.

    To construct a GeospatialRegionIndex object, use StructuredQueryBuilder.geoRegionPath . When defining the index, you must include a PathIndex value, and you may also include coordinate system and precision information.

    For example, the following code snippet identifies a region index on the path /envelope/cts-region with the coordinate system wgs84.

    The following example uses the Java Client API to build a structured query equivalent to the query in Example: Simple Intersection Region Query. The example as written will only match the XML sample documents from Preparing to Run the Examples. You can match the JSON sample documents by changing the index path to /envelope/ctsRegion.

    If you run the above program against the sample data and database configuration from Preparing to Run the Examples, you should see output similar to the following:

    Node.js Client API

    This topic assumes you are familiar with the search features of the Node.js Client API. If you are not, you may want to review the Node.js Application Developer's Guide.

    To construct a geospatial region query, use queryBuilder.geospatialRegion. You cannot create a region query using queryBuilder.parsedFrom or queryBuilder.byExample. Use queryBuilder.geoPath to construct the region index specification, and helper functions such as queryBuilder.polygon to construct the criteria region(s).

    Each geospatial region query can only reference a single region index. To search more than one index, construct multiple region queries and combine them with an OR query.

    The following example performs the same search as Example: Simple Intersection Region Query. The example relies on the sample documents and database configuration from Preparing to Run the Examples. Before running the example, modify the connection information in connInfo.

    If you run the example against the sample data from Preparing to Run the Examples, you should see output similar to the following:

    REST Client API

    This topic assumes you are already familiar with the search features of the REST Client API. If you are not, refer to the REST Application Developer's Guide.

    To evaluate a geospatial region using the REST Client API, you can use either a cts:geospatial-region-query or a structured query that contains a geo-region-path-query or a geo-region-constraint-query. Your cts or structured query can be standalone or part of a combined query. You cannot construct query text or a QBE that represents a region query.

    Each structured region query can reference only one region index. To search more than one region index at a time, create multiple region queries and combine them with an or-query.

    The following example uses the REST Client API and a structured query to perform the same search as the one in Example: Simple Intersection Region Query. The example as written will only match the XML sample documents from Preparing to Run the Examples. You can match the JSON documents by changing the path-index to /envelope/ctsRegion.

    Copy the following query into a file. You will use the file as the POST body of a search request. The example curl command below assumes the filename is body.xml.

    Run a curl command similar to the following to perform the search. Before running the command, change the username and password. If you are not using the Documents database as your content database, you will need to add a database request parameter to the URL.

    The search should match the following documents:

    The following is the equivalent structured query, expressed as JSON.

    You can use this query with a curl command similar to the XML example. Just change the request body content type and, potentially, the name of the file containing the body. For example:

    Example: Using the Envelope Pattern to Encode Regions

    Content you search with a region query must be in WKT or serialized cts:region format. This example illustrates using the envelope pattern to encapsulate the searchable region format along with original data in an incompatible format. This is not the only solution to this problem. For example, you can transform your content before ingesting it into MarkLogic, or you can replace the unsupported original format entirely, rather than persisting both.

    The example reads in a file from the file system that contains an aggregate XML element holding several KML Placemark elements. The data is disaggregated into one file per Placemark, and then each Placemark is wrapped in an envelope that contains both the original data and the serialized representation of a cts:region that corresponds to the region in the original data.

    For example, if the original input file has the following structure:

    Then the result is one document per Placemark, with the following structure. The envelope root element and the cts-region element are created by the ingest transformation.

    The following example query ingests the original data and creates a document from the envelope it wraps around each Placemark:

    The above example does not translate points simply because it is not necessary for this purpose. You can index and use point queries on KML points without transformation. You only need to transform them if you want to use a region query on point data.

    Controlling Coordinate System and Precision

    The Relationship Between Precision and Coordinate System

    The coordinate system and precision are conflated in the coordinate system name in many operations that accept a coordinate system name as input.

    For example, when you specify wgs84 as the value of the coordinate-system option in a query constructor, it also implicitly specifies single precision. Similarly, a value of wgs84/double specifies both the WGS84 coordinate system and double precision.
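    This naming convention can be illustrated with a small helper. The function below is hypothetical, for illustration only (it is not a MarkLogic API): it splits a coordinate system name such as wgs84/double into the coordinate system and the precision it implies.

```javascript
// Hypothetical helper: derive the system and implied precision from a
// coordinate system name. A bare name like "wgs84" implies single (float)
// precision; "wgs84/double" specifies double precision explicitly.
function parseCoordinateSystemName(name) {
  const [system, precision = 'float'] = name.split('/');
  return { system, precision };
}
```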

    In many interfaces, you can use a precision option or parameter to override the precision implied by the coordinate system name.

    Determining the Best Precision for Your Application

    MarkLogic always preserves the precision of geospatial data in your documents. For example, if you ingest documents containing double precision coordinate values, those values retain full precision, even if the governing coordinate system during ingestion specifies single precision.

    However, the precision of values in a geospatial index is determined by the configuration of the index. Thus, you might have double precision values in your documents, but single precision values in the corresponding geospatial index.

    A double precision index enables a greater degree of accuracy when computing geospatial search matches, but at the cost of increased memory requirements and some computational overhead.

    Greater precision does not equate to greater accuracy. Most applications do not require double precision indexing.

    For example, geospatial queries against single precision indexes are accurate to within 1 meter for geodetic coordinate systems. If your application does not require sub-meter accuracy, then there is no reason to incur the overhead of a double precision index.

    The following are examples of geospatial applications that might require double precision:

    • Tracking equipment moving around a facility.
    • Tracking room-to-room movements within a building.
    • Tracking slow-moving objects that move in sub-meter increments, such as fault lines and tectonic plates.
    • Tracking assets that require high placement precision, such as which side of a street a fire hydrant is located on.

    Excessive precision can cause difficulty for geospatial operations. For example, when comparing two points at double precision, they will fail a test for equality if the coordinate values differ at the level of microns. Most applications would consider such a difference to be noise and treat the points as the same. Comparison of double-precision coordinates assumes a tolerance of zero by default, meaning the coordinates must match exactly, at all digits of precision. This affects operations such as comparison of points, testing whether a point is on an edge, and testing two edges for adjacency. You can use the tolerance option available on some operations to enable less precise comparisons. For more details, see Understanding Tolerance.
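    The tolerance idea can be sketched in plain JavaScript. The helper below is hypothetical, not MarkLogic's implementation; it only illustrates why a zero tolerance makes micron-scale noise break an equality test.

```javascript
// Hypothetical point comparison. With tolerance 0 (the default), the
// coordinates must match exactly at all digits of precision.
function samePoint(p1, p2, tolerance = 0) {
  return Math.abs(p1.lat - p2.lat) <= tolerance &&
         Math.abs(p1.lon - p2.lon) <= tolerance;
}

const a = { lat: 37.5073428, lon: -122.2465038 };
const b = { lat: 37.507342800001, lon: -122.2465038 }; // noise-level difference

const exact = samePoint(a, b);       // zero tolerance: points compare unequal
const fuzzy = samePoint(a, b, 1e-9); // small tolerance: treated as the same point
```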

    An application can use a mix of single and double precision geospatial indexes and operations. For example, you can define both a single and a double precision index over the same data. You can specify precision per operation.

    You can control geospatial precision in the following ways:

    • Specify float or double precision when creating a geospatial point or region index. This determines the precision of values stored in the index. For details, see Determining the Best Precision for Your Application.
    • Configure an App Server default precision. This specifies the precision to use in geospatial queries and computations when no other precision override is in effect. The default is single (float) precision. For details, see Specifying the App Server Default Coordinate System.
    • In XQuery, you can specify a default precision for a main module. This overrides the App Server default precision. For details, see Specifying the Per-Module Coordinate System.
    • Specify precision on an operation, such as when constructing a geospatial query, computing a distance, or accessing the coordinates of a box. This overrides the App Server and module default precision. For details, see Specifying a Per-Operation Coordinate System and Precision.

    You can specify precision in conjunction with the coordinate system name in most MarkLogic geospatial interfaces. For example, the wgs84 and raw coordinate system names imply single precision, while the wgs84/double and raw/double coordinate system names specify double precision.

    How MarkLogic Selects the Governing Coordinate System

    When MarkLogic evaluates your XQuery or Server-Side JavaScript code, the governing coordinate system is the first of the following settings found. For more details, see The Governing Coordinate System.

      1. Per-operation coordinate system option or parameter
      2. Per-module coordinate system, as specified by the XQuery xdmp:coordinate-system prolog option. (This feature is only available in XQuery main modules.)
      3. App Server default coordinate system

    If you specify a precision using the precision option of an operation, the specified precision always takes precedence over the precision implied by the governing coordinate system name.

    The following examples illustrate the governing coordinate system applied in several calling contexts if the App Server default coordinate system is wgs84.
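    The selection logic can be sketched as a simple resolver. This is a plain JavaScript illustration only, not a MarkLogic API; a null argument stands for "not specified at that level":

```javascript
// Illustrative sketch of how the governing coordinate system is chosen.
// The first setting found wins, checked in precedence order.
function governingCoordinateSystem(perOperation, perModule, appServerDefault) {
  return perOperation ?? perModule ?? appServerDefault ?? "wgs84";
}

// With an App Server default of wgs84 and a per-module override:
governingCoordinateSystem(null, "wgs84/double", "wgs84"); // "wgs84/double"
```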

    See the following topics for instructions on setting the coordinate system and precision at various levels:

    Probing the Governing Coordinate System Name

    You can probe the governing coordinate system using the XQuery function geo:default-coordinate-system or the JavaScript function geo.defaultCoordinateSystem. (This function can only account for the App Server default and per-module settings.)

    The following examples illustrate how to retrieve the name of the governing coordinate system.

    Language Example
    XQuery
    Server-Side JavaScript

    Specifying the App Server Default Coordinate System

    You can use the following Admin library functions to set and get the default coordinate system/precision combination for an App Server. If you do not explicitly set the coordinate system and precision, it is wgs84 (single precision).

    For example, the following XQuery code sets the default coordinate system for the App Server named MyAppServer to wgs84/double.

    You can also use the XQuery Admin library module from Server-Side JavaScript. The following example is equivalent to the previous XQuery code.

    To determine the canonical name for a coordinate system/precision combination, use the XQuery function geo:coordinate-system-canonical or the JavaScript function geo.coordinateSystemCanonical.

    Specifying the Per-Module Coordinate System

    In XQuery, you can use the xdmp:coordinate-system prolog option to override the App Server default coordinate system module-wide. This option is only available in XQuery. For example:

    The override only takes effect when you declare the option in an XQuery main module, but it affects any library module functions subsequently invoked from that main module.

    REST, Java, and Node.js Client API resource extensions are library modules, so you cannot override the coordinate system in your extension implementation. Use the ad-hoc query features (eval or invoke) of the Client APIs or a per-operation override if you need to override the App Server default with these APIs.

    For example, if you create a library function that just returns the result of calling geo:default-coordinate-system, then the following main module will return wgs84/double for the coordinate system.

    Specifying a Per-Operation Coordinate System and Precision

    Many geospatial operations accept options for specifying the coordinate system and/or precision. A per-operation specification overrides the governing coordinate system. For example:

    Language Example
    XQuery
    Server-Side JavaScript

    Where both the coordinate-system and precision options are supported, you can specify the precision either as part of the coordinate system canonical name or independently. Where there is a conflict between the precision in the coordinate system name and the precision option, the precision option takes precedence.
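    The interaction can be sketched as follows. This is an illustration in plain JavaScript, not a MarkLogic API; it assumes only the wgs84 vs. wgs84/double canonical-name convention described above:

```javascript
// Illustrative: resolve the effective precision when both a coordinate
// system name (possibly canonical, e.g. "wgs84/double") and a separate
// precision option may be present.
function effectivePrecision(coordinateSystemName, precisionOption) {
  if (precisionOption) return precisionOption; // the precision option wins
  return coordinateSystemName.endsWith("/double") ? "double" : "float";
}

effectivePrecision("wgs84/double", null);    // "double" (from the canonical name)
effectivePrecision("wgs84/double", "float"); // "float" (option takes precedence)
```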

    The following example illustrates how the option settings interact:

    Options Resulting Coordinate System
    wgs84 (single precision)

    You can get the canonical name for a coordinate system/precision combination using the XQuery function geo:coordinate-system-canonical or the JavaScript function geo.coordinateSystemCanonical.

    Specifying Coordinate System During Index Creation

    You can choose single or double precision when creating a geospatial index. This determines the precision of the values stored in the index. For example, you can create a single precision index over your geospatial data even if the data in your documents is double precision.

    Specify precision during index creation through the coordinate system name. You can use the XQuery function geo:coordinate-system-canonical or the JavaScript function geo.coordinateSystemCanonical to generate the canonical name of the desired coordinate system and precision combination.

    For example, the following code creates a geospatial element index for double precision wgs84 point values:

    Language Example
    XQuery
    Server-Side JavaScript

    Understanding Tolerance

    Tolerance is the largest allowable variation in geometry calculations. Tolerance is a distance within which two points are considered equal, a point is considered to lie on an edge, or two edges are considered touching. Many geospatial functions in MarkLogic accept a tolerance option.

    See the following topics for more details:

    How Tolerance Affects Geometric Comparisons

    Tolerance defines the largest acceptable error when comparing two points for equality.

    For example, a tolerance of zero means two points only match if they're exactly the same, out to the least significant digit. Thus, two points separated by a distance measurable in microns would not match. If you're trying to determine whether a truck is parked at the door of a building, such a high degree of precision is a hindrance. Use tolerance to filter out differences that are in the noise.

    The following diagram illustrates how tolerance affects point comparison. A, B, and C are points. The shaded circle describes the space within which points are considered equal to A, based on the tolerance. Point B falls within tolerance of A, so A and B are considered equal. Point C is further from A than the tolerance allows, so A and C are not considered equal.
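    The comparison can be sketched in plain JavaScript using planar distance. This is only an illustration of the concept; MarkLogic's actual comparisons are geodetic:

```javascript
// Two points are considered equal if the distance between them does not
// exceed the tolerance. Planar distance is used here for illustration.
function pointsEqual(a, b, tolerance) {
  return Math.hypot(a.x - b.x, a.y - b.y) <= tolerance;
}

const A = { x: 0, y: 0 };
const B = { x: 0, y: 0.0005 }; // within tolerance of A
const C = { x: 0, y: 0.0050 }; // outside tolerance of A
pointsEqual(A, B, 0.001); // true
pointsEqual(A, C, 0.001); // false
```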

    When comparing edges, a point is considered as lying on the edge if the distance from the point to the edge is within the tolerance.

    For example, in the following diagram, any point in the green circles coincides with an endpoint of the edge. Any point within the pink region lies on the edge. Thus, point A coincides with an endpoint and point B lies on the edge. Point C is outside the tolerance range, so it is not considered to lie on the edge.
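    The point-on-edge test can be sketched the same way, again as a planar illustration rather than MarkLogic's geodetic computation:

```javascript
// A point lies "on" an edge if its distance to the segment is within
// tolerance. Planar geometry for illustration only.
function distanceToSegment(p, a, b) {
  const abx = b.x - a.x, aby = b.y - a.y;
  const len2 = abx * abx + aby * aby;
  // Project p onto the line through a and b, clamped to the segment.
  let t = len2 === 0 ? 0 : ((p.x - a.x) * abx + (p.y - a.y) * aby) / len2;
  t = Math.max(0, Math.min(1, t));
  return Math.hypot(p.x - (a.x + t * abx), p.y - (a.y + t * aby));
}

function pointOnEdge(p, a, b, tolerance) {
  return distanceToSegment(p, a, b) <= tolerance;
}

pointOnEdge({ x: 0.5, y: 0.0005 }, { x: 0, y: 0 }, { x: 1, y: 0 }, 0.001); // true
```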

    Operations such as computing whether two polygons intersect require comparing two edges. Two edges overlap if both endpoints of one edge lie on the other, or if an endpoint of each edge lies on the other.

    For example, the following diagram illustrates the effect of two different tolerance values on determining overlap. The green circles represent the tolerance of each endpoint. With the smaller tolerance, the edges do not overlap. With the larger tolerance, both endpoints of one edge are within tolerance of the other edge, so the edges overlap.

    Considerations for Tolerance Selection

    If you do not explicitly set tolerance, MarkLogic uses the default tolerance appropriate for the coordinate system.

    To ensure accuracy, MarkLogic enforces a minimum tolerance for each coordinate system. If the tolerance is too small, the underlying distance calculations might not be accurate to the specified level of precision.

    You cannot choose a tolerance value less than zero.

    For most operations, MarkLogic interprets a tolerance of zero as the minimum tolerance for the coordinate system. The only exceptions are the XQuery function geo:bounding-boxes and the JavaScript function geo.boundingBoxes, as follows:

    When computing bounding boxes, a non-zero tolerance causes the bounding boxes to be padded by the tolerance amount. This ensures the bounding box covers the thickened boundary of the region under consideration. If you set tolerance to zero when computing bounding boxes, then the bounding boxes are not padded at all.
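    The padding behavior can be sketched as follows. The box field names here are an assumption for illustration, not MarkLogic's cts:box accessors:

```javascript
// Pad a bounding box outward by the tolerance so it covers the
// thickened boundary of the region. Field names are illustrative.
function padBoundingBox(box, tolerance) {
  return {
    south: box.south - tolerance,
    west:  box.west  - tolerance,
    north: box.north + tolerance,
    east:  box.east  + tolerance,
  };
}

padBoundingBox({ south: 10, west: 20, north: 11, east: 21 }, 0.5);
// → { south: 9.5, west: 19.5, north: 11.5, east: 21.5 }
```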

    When considering a polygon, tolerance effectively thickens the boundary of the polygon. If you set the tolerance too high relative to the size of the polygon, the polygon degenerates. This can result in unexpected results or errors.

    You can use the XQuery function geo:region-approximate or the Server-Side JavaScript function geo.regionApproximate to simplify your region(s) before performing geometric computations. The simplification can sometimes help you balance tolerance against polygon degeneration.

    Geospatial computational and comparison operations that do not accept a tolerance option behave as if tolerance is set to zero.
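    The tolerance-selection rules above can be sketched as follows. This is an illustration only; the minimum value passed in is a placeholder, not MarkLogic's actual per-coordinate-system minimum:

```javascript
// Illustrative: how a requested tolerance maps to the tolerance actually
// used for most operations (bounding-box computation excepted).
function effectiveTolerance(requested, minimumForCoordinateSystem) {
  if (requested < 0) throw new Error("tolerance cannot be less than zero");
  // Zero means "use the minimum"; smaller positive values are raised to it.
  return requested === 0
    ? minimumForCoordinateSystem
    : Math.max(requested, minimumForCoordinateSystem);
}

effectiveTolerance(0, 0.001);   // 0.001 (zero means the minimum)
effectiveTolerance(0.5, 0.001); // 0.5
```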

    Summary of Other Geospatial Operations

    The following APIs are used to perform various operations and calculations on geospatial data:

    Converting To and From Common Geospatial Representations

    MarkLogic provides interfaces for converting between MarkLogic geospatial primitive types and several common geospatial text, XML, and JSON representations. This section covers the following topics:

    Conversion Overview

    You can use MarkLogic APIs to convert to and from the following common geospatial representations:

    For example, the following XQuery code uses geogml:parse-gml to convert a GML region into a cts:region (a polygon in this case). This function determines the output cts:region type from the kind of input GML region.

    If you know the input region type, you can also use one of the region-specific constructors to perform the equivalent conversion. For example, the above code could use geogml:polygon instead of geogml:parse-gml. For details, see Constructing Geospatial Point and Region Values.

    The following Server-Side JavaScript code converts a GeoJSON polygon into a cts.polygon:

    For each format, the XQuery API includes a parse-format function for converting from the common representation to a MarkLogic geospatial primitive type, and the JavaScript API includes a parseFormat function for the same purpose. This operation is equivalent to calling the geo:parse XQuery function or the geo.parse JavaScript function with input of the same format. The API also includes a to-format XQuery function and a toFormat JavaScript function for converting from a MarkLogic primitive type to the target format.

    For example, the GeoJSON library module includes the following functions that can be used to convert data between GeoJSON and cts:region.

    You can use the built-in geo:parse XQuery function or geo.parse JavaScript function to convert nodes in any of the supported formats into an equivalent MarkLogic geospatial primitive type, without regard to the input format or region type. For best performance, if you know the format, use the equivalent format-specific functions.

    WKT and WKB Conversions in XQuery

    MarkLogic represents geospatial data using the cts:region type and types derived from it, such as cts:point, cts:polygon, and cts:circle. You can convert from WKT or WKB into cts:region items and from cts:region into WKT or WKB.

    Use the geo:parse-wkt function to convert WKT data into a sequence of cts:region items. Similarly, use geo:parse-wkb to convert WKB data into a sequence of cts:region items. You can use the resulting items in geospatial cts:query constructors or geospatial operations.

    For example, the following call converts a WKT polygon with an inner and outer boundary into a cts:complex-polygon:

    The input to geo:parse-wkb is a binary node that contains a WKB byte sequence. For example, the following code converts a WKB byte sequence representing the coordinates (-73.700380647, 40.739754168) into a cts:point:

    To convert from cts:region to WKT, use geo:to-wkt. For example, the following code returns a WKT POINT:

    Similarly, the following code returns a WKB POINT:

    You cannot convert a cts:circle or a cts:box to WKT. For more details on WKT, see http://en.wikipedia.org/wiki/Well-known_text.

    WKT and WKB Conversions in JavaScript

    MarkLogic represents geospatial data using the cts.region type and types derived from it, such as cts.point, cts.polygon, and cts.circle. MarkLogic provides the following conversions between WKT or WKB and cts.region. You can use the cts.region representation in cts:query constructors or geospatial operations.

      • Explicit conversion from WKT or WKB to cts.region using the geo.parseWkt or geo.parseWkb functions. For example:
      • Explicit conversion from cts.region to WKT or WKB using the geo.toWkt or geo.toWkb functions. For example:
      • Implicit conversion from WKT to cts.region where the expected type is a cts.region. For example:

    Note that geo.parseWkt and geo.toWkt return a Sequence rather than an array or a single value. The input to geo.parseWkb is a binary node that contains a WKB byte sequence.

    The supported conversions from WKT to cts.region mean all the following calls pass the same cts.polygon value to geo.polygonContains, which returns true:

    You cannot convert a cts.circle or a cts.box to WKT. For more details on WKT, see http://en.wikipedia.org/wiki/Well-known_text.
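    For a sense of what WKT parsing involves, here is a minimal sketch handling only the POINT geometry. The real geo.parseWkt supports the full WKT grammar and returns cts regions rather than plain objects; this is an illustration only:

```javascript
// Minimal illustrative WKT parser for POINT only. Anything else is
// rejected, loosely mirroring the XDMP-BADWKT error for unsupported input.
function parseWktPoint(wkt) {
  const m = /^POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)$/i.exec(wkt.trim());
  if (m === null) throw new Error("XDMP-BADWKT: " + wkt);
  // WKT orders coordinates as (longitude latitude).
  return { longitude: parseFloat(m[1]), latitude: parseFloat(m[2]) };
}

parseWktPoint("POINT (-73.700380647 40.739754168)");
// → { longitude: -73.700380647, latitude: 40.739754168 }
```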

    Mapping of WKT and WKB Types to MarkLogic Types

    The following table shows how the WKT and WKB types map to the MarkLogic geospatial types. That is, it shows the value type that results from calling the geo:parse-wkt XQuery function or the geo.parseWkt JavaScript function, or their WKB equivalents.

    WKT/WKB Geometry MarkLogic XQuery Type MarkLogic JavaScript Type
    POINT cts:point cts.point
    POINT EMPTY (WKT only) cts:point (flagged as empty) cts.point (flagged as empty)
    POLYGON cts:complex-polygon or cts:polygon cts.complexPolygon or cts.polygon
    POLYGON EMPTY cts:complex-polygon (flagged as empty) cts.complexPolygon (flagged as empty)
    LINESTRING cts:linestring cts.linestring
    LINESTRING EMPTY cts:linestring (flagged as empty) cts.linestring (flagged as empty)
    TRIANGLE cts:polygon cts.polygon
    TRIANGLE EMPTY cts:complex-polygon (flagged as empty) cts.complexPolygon (flagged as empty)
    MULTIPOINT cts:point* zero or more cts.point nodes
    MULTIPOINT EMPTY () an empty Sequence
    MULTILINESTRING cts:linestring* zero or more cts.linestring nodes
    MULTILINESTRING EMPTY () null, empty array, or empty Sequence
    MULTIPOLYGON (cts:polygon or cts:complex-polygon)* (cts.polygon or cts.complexPolygon)*
    MULTIPOLYGON EMPTY () null, empty array, or empty Sequence
    GEOMETRYCOLLECTION cts:region* zero or more cts.region nodes
    GEOMETRYCOLLECTION EMPTY () null, empty array, or empty Sequence
    others throws XDMP-BADWKT throws XDMP-BADWKT

    Constructing Geospatial Point and Region Values

    Use the following APIs to construct geospatial regions. You can use the resulting region values in geospatial query constructors and other geospatial operations, such as those listed in Summary of Other Geospatial Operations.

    These constructors accept either the raw data, such as a pair of float values for constructing a point, or a string representing the serialization of the underlying primitive type. The serialized representation can be either the MarkLogic internal representation, such as a serialized cts:point, or a WKT serialization. If the primitive is not constructible from the string input, an exception is thrown.

    Each constructor produces a region value of the corresponding primitive type. For example, the cts:box constructor function creates a value of type cts:box. Each of the geospatial primitive types is an instance of the cts:region base type ( cts.region in JavaScript).

    For example, the following call constructs a cts:polygon from a string of serialized cts:point values (space-separated points):

    You can also construct the primitive types from XML or JSON nodes that contain geospatial data in the supported formats. For example, the following XQuery code uses the geokml:box function to construct a cts:box from an XML element containing a KML LatLongBox.

    Similarly, the following example uses the geojson.box JavaScript function to construct a cts.box from a JSON node that contains a suitable GeoJSON polygon:

    Geospatial Query Support in Other APIs

    The Search API enables geospatial queries through the following features:

    • Define geospatial constraints using query options such as geo-elem-pair-constraint, geo-path-constraint, and geo-json-property-constraint. For details, see search:search and Search Customization Using Query Options.
    • Create geospatial structured queries using composers such as geo-elem-query and geo-json-property-pair-query. For details, see Searching Using Structured Queries.
    • Add a heatmap to a geospatial point constraint to generate geospatial facets. For an example, see Geospatial Constraint Example.

    For information on specific geospatial query options, see Appendix: Query Options Reference.

    The Client APIs for REST, Java, and Node.js applications provide similar support. For more details and examples, see the following topics:

    Preparing to Run the Examples

    Use the instructions in this section to load the data and configure the indexes used in several examples in this chapter. The following topics are covered:

    Overview of the Sample Data

    The sample data contains points, linestrings, and polygons associated with landmarks near the MarkLogic headquarters. The following diagram approximates the relative positions of the features in the sample data.

    This geospatial data is made available in two formats: KML (XML) and GeoJSON. Each document describes a single point or a region. The points and regions are the same in the two types of documents (XML and JSON). For example, the documents /geo-examples/Airport.xml and /geo-examples/Airport.json describe the same region.

    The documents are added to the following collections to make it easy to select the type of data to work with: just XML, just JSON, or both formats.

    Document Set Collections
    KML geo-examples, geo-xml-examples
    GeoJSON geo-examples, geo-json-examples

    Each document uses the envelope pattern to encapsulate the original KML or GeoJSON region coordinates with a serialized cts region that is suitable for use with cts:geospatial-region-query. The unprocessed KML input is a single XML file that contains a series of KML Placemark elements. The following data snippet shows the structure of the raw input:

    The ingestion process splits the input into one document per Placemark. The envelope pattern is used to encapsulate the original Placemark with an equivalent serialized cts:region if the Placemark represents a non-point region. Region queries only operate on regions expressed as WKT or serialized cts:regions, so you cannot query the Placemark coordinates directly. (Point regions are left untranslated for convenience in demonstrating point queries; you could choose to treat them the same way.)

    The following example shows the final document format, with the envelope root element and the cts-region element created by the ingest transformation. If a Placemark represents a point, then no cts-region element is added because it is not needed.

    The GeoJSON data receives similar treatment. The raw input is a feature collection. Ingestion creates one document per feature. The envelope pattern is used to encapsulate each feature with an equivalent serialized cts:region to facilitate queries. Point regions are not transformed.

    For example, the raw GeoJSON input has the following structure:

    Ingestion produces documents of the following form. For documents containing a region, the ingestion transformation adds the envelope wrapper and a ctsRegion. For documents containing a point, the ingestion transformation just adds the envelope wrapper.

    You can thus use region queries on /envelope/cts-region (XML) or /envelope/ctsRegion (JSON) and point queries on /envelope/kml:Placemark/Point/coordinates (XML) or geometry[type = 'Point']/array-node('coordinates') (JSON).

    Configuring the Indexes

    This section walks through configuring a point index and a region index over the XML and JSON sample documents, for a total of four indexes. Separate indexes are used for each data set to showcase a variety of indexes and to make it easy to focus on one content type or the other.

    Point indexes are optional for some types of geospatial point queries, but required for geospatial point range queries and lexicon operations. An index is usually recommended for best performance.

    Geospatial region queries always require an index.

    You can skip over indexes related to content that does not interest you. For example, you can skip the XML-related indexes if you are only interested in JSON. However, some examples in this chapter may not work properly without the related indexes.

    Choose one of the following methods to create the indexes.

    You can also create indexes using the REST Management API. This method is not included here. For details, see the MarkLogic REST API Reference.

    Creating Indexes Using the Admin Interface

    The following table summarizes the configuration characteristics of the indexes you will create in the Admin Interface. Use this information in Steps 5 and 7 of the following procedure. Use the default value for any characteristic not specified here.

    Point index for the XML documents (geospatial element child index):

    • Parent namespace URI: http://www.opengis.net/kml/2.2
    • Child namespace URI: http://www.opengis.net/kml/2.2
    • Child localname: coordinates
    • Point format: long-lat-point

    Point index for the JSON documents (geospatial path index):

    • Path expression: geometry[type = 'Point']/array-node('coordinates')
    • Point format: long-lat-point

    Region index for the XML documents (geospatial region path index):

    • Path expression: /envelope/cts-region

    Region index for the JSON documents (geospatial region path index):

    • Path expression: /envelope/ctsRegion

    Use the following procedure to create the geospatial indexes using the above configuration information. For more information about the Admin Interface, see Administrator's Guide.

      1. Navigate to the Admin Interface in your browser. For example, navigate to http://localhost:8001 if your MarkLogic installation is on localhost.
      2. Authenticate as a user with administrative privileges.
      3. Click Databases in the tree menu on the left to expand the list of databases. The tree menu expands to display the available databases.
      4. Click the name of the database for which you want to create an index. For example, click Documents. The tree menu expands to display the configuration categories for this database.
      5. Click the Geospatial Indexes icon in the tree menu, under the selected database.
      6. To create a point index:
        a. Click Geospatial Point Indexes in the tree menu, under the selected database.
        b. Click the type of point index you want to create. For example, click Geospatial Element Child Indexes. The configuration page for this index type is displayed on the right.
      Creating the Indexes with XQuery

      Use the procedure in this section to create the indexes using XQuery and Query Console. If you are not familiar with Query Console, see the Query Console User Guide. For equivalent JavaScript instructions, see Creating the Indexes with JavaScript.

      The following procedure creates two point indexes and two region indexes.

        1. Navigate to Query Console in your browser. For example, navigate to http://localhost:8000 if your MarkLogic installation is on localhost.
        2. Copy the following code into a new query tab in Query Console. If you are not using the Documents database for your content database, modify the value of the $database variable.
        3. Select XQuery in the Query Type dropdown if it is not already selected.
        4. Click the Run button in Query Console to create the indexes.

      The script produces no output when it is successful. You can use the Admin Interface to explore the geospatial indexes and confirm the indexes were created.

      Creating the Indexes with JavaScript

      Use the procedure in this section to create the indexes using Server-Side JavaScript and Query Console. If you are not familiar with Query Console, see the Query Console User Guide. For equivalent XQuery instructions, see Creating the Indexes with XQuery.

      The following procedure creates two point indexes and two region indexes.

        1. Navigate to Query Console in your browser. For example, navigate to http://localhost:8000 if your MarkLogic installation is on localhost.
        2. Copy the following code into a new query tab in Query Console. If you are not using the Documents database for your content database, modify the value of the database variable.
        3. Select JavaScript in the Query Type dropdown if it is not already selected.
        4. Click the Run button in Query Console to create the indexes.

      The script produces no output when it is successful. You can use the Admin Interface to explore the geospatial indexes and confirm the indexes were created.

      Creating the Input Data Files

      Follow the instructions in this section to create two files containing the raw XML and JSON sample data. These files are used by the procedure in Loading the Sample Data. Create both files, unless you plan to skip the examples involving one document format.

      Creating the XML Input File

      Copy the following data to a file on the filesystem. Choose a location that is readable by your MarkLogic installation. You can use any file name, but the subsequent instructions assume geo-examples.xml.

      Next, create the JSON data file following the instructions in Creating the JSON Input File.

      Creating the JSON Input File

      Copy the following data to a file on the filesystem. Choose a location that is readable by your MarkLogic installation. You can use any file name, but the subsequent instructions assume geo-examples.json.

      After saving the data to a file, load the sample data into the database using the instructions in Loading the Sample Data.

      Loading the Sample Data

      The procedures in this section load the raw data from Creating the Input Data Files into documents of the form discussed in Overview of the Sample Data.

      This section uses XQuery to load the XML documents and Server-Side JavaScript to load the JSON documents. You could use either language to load both, but the XML transformations flow more naturally in XQuery, while the JSON transformations flow more naturally in JavaScript.

      You can load either or both data sets, but some examples in this chapter will not work if you do not load both. You do not need to be familiar with either XQuery or JavaScript to follow the instructions in this section.

      Loading the XML Sample Data

      The procedure in this section uses XQuery to load the sample data because it is easier to do XML transformations using XQuery. You do not need to be familiar with XQuery to follow this procedure.

      Before you begin, you should have completed the steps in Creating the Input Data Files.

        1. Open the Query Console tool in your browser. For example, if MarkLogic is installed on localhost, navigate to the following URL: http://localhost:8000.
        2. Copy the following query into a new query tab in Query Console.
        3. Modify the query to set the value of the $INPUT-FILE variable to the absolute path to the file containing the raw XML input data. This is the file you created in Creating the Input Data Files.
        4. Choose XQuery in the Query Type dropdown list.
        5. Choose the database into which you want to insert the documents in the Database dropdown list. For example, choose the Documents database.
        6. Click the Run button to evaluate the query and create documents in the database.
        7. Optionally, click the explorer icon to the right of the Database dropdown to explore the database contents and examine the new documents.

      If the query is successful, the following documents are created. All the documents have a /geo-examples/ directory prefix and are in collections named geo-xml-examples and geo-examples.

        • /geo-examples/Airport.xml
        • /geo-examples/Holly-St.xml
        • /geo-examples/Hwy-101.xml
        • /geo-examples/MarkLogic-HQ.xml
        • /geo-examples/MarkLogic-Neighborhood.xml
        • /geo-examples/Museum.xml
        • /geo-examples/Restaurant.xml
        • /geo-examples/Shopping-Center.xml
        • /geo-examples/Wildlife-Refuge.xml

      For more information about the data, see Overview of the Sample Data.

      Next, load the JSON sample documents using the instructions in Loading the JSON Sample Data.

      Loading the JSON Sample Data

      The procedure in this section uses Server-Side JavaScript to load the sample data because it is easier to do JSON transformations using JavaScript. You do not need to be familiar with JavaScript to follow this procedure.

        1. Open the Query Console tool in your browser. For example, if MarkLogic is installed on localhost, navigate to the following URL: http://localhost:8000.
        2. Copy the following query into a new query tab in Query Console.
        3. Modify the query to set the value of the $INPUT-FILE variable to the absolute path to the file containing the raw input data. This is the file you created in Creating the Input Data Files.
        4. Choose JavaScript in the Query Type dropdown list.
        5. Choose the database into which you want to insert the documents in the Database dropdown list. For example, choose the Documents database.
        6. Click the Run button to evaluate the query and create documents in the database.
        7. Optionally, click the explorer icon to the right of the Database dropdown to explore the database contents and examine the new documents.

      If the query is successful, the following documents are created. All the documents have a /geo-examples/ directory prefix and are in collections named geo-json-examples and geo-examples.

        • /geo-examples/Airport.json
        • /geo-examples/Holly-St.json
        • /geo-examples/Hwy-101.json
        • /geo-examples/MarkLogic-HQ.json
        • /geo-examples/MarkLogic-Neighborhood.json
        • /geo-examples/Museum.json
        • /geo-examples/Restaurant.json
        • /geo-examples/Shopping-Center.json
        • /geo-examples/Wildlife-Refuge.json

      For more information about the data, see Overview of the Sample Data.

      Your database is now properly configured to run the examples in this chapter.

