Sunday, April 22, 2012

a little bit about Oracle Quote Operator and Substitution Variables

keywords: oracle quote operator, substitution variables, set define, string, ampersand, global temporary table

I would like to share in this post one piece of Oracle functionality that might save you some time. If you write PL/SQL code, or if you build SQL statements from some other programming language, sooner or later you will have to store textual data that contains quotes, or to create an SQL statement that inserts such a varchar into the database. In these and similar cases you will struggle with lots of quote signs. To avoid all that trouble, try the Oracle quote operator:
q'<delimiter><string><delimiter>'
Here is how it works:
declare
   l_str varchar2(30) := q'[I’ll quote!]';
begin
   dbms_output.put_line(l_str);
end;
or just a simple select:
select q'[i’ll quote!]' from dual;
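The delimiter can be almost any character; if you pick one of the paired brackets ([ ], { }, ( ), < >) the closing delimiter is the matching bracket. A few equivalent variations (the strings below are just illustrations):
select q'{I'll quote!}' from dual;
select q'(I'll quote!)' from dual;
select q'#I'll quote!#' from dual;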
The last time I used this functionality (introduced in Oracle 10g) I had to store in the database many predefined CQL filters and GetLegendGraphic HTTP requests (the intention was to store the configuration of a web GIS application in the database; Geoserver was the GIS engine used on the server side). Besides the CQL and legend elements I had to store some other configuration elements, but for this post I would like to show you one more thing that can be helpful when working with strings... especially if you are storing URLs in a database table.
Here is an example of a URL I had to store in the database.
../geoserver/wms?REQUEST=GetLegendGraphic&VERSION=1.0.0&FORMAT=image/png&WIDTH=20
&HEIGHT=20&LAYER=PP:COUNTY
It is a standard GET request that contains ampersand signs. To demonstrate what is so special about it, I have prepared several SQL statements... so... open your SQL*Plus and try to run them... First create the table for testing, then run script 1 and then script 2... after testing, remove the table...
--to create table:
create global temporary table my_test_table (
 test_string  varchar2(1000)
) on commit preserve rows;
--to remove table:
truncate table my_test_table;
drop table my_test_table;
--script 1
set define off;
insert into  my_test_table
values (q'[../geoserver/wms?REQUEST=GetLegendGraphic&VERSION=1.0.0&FORMAT=image/png&WIDTH=20&HEIGHT=20&LAYER=PP:COUNTY]');
commit;
Try to execute the same statement, but now with the option set define on.
--script 2
set define on;
insert into  my_test_table
values (q'[../geoserver/wms?REQUEST=GetLegendGraphic&VERSION=1.0.0&FORMAT=image/png&WIDTH=20&HEIGHT=20&LAYER=PP:COUNTY]');
commit;
You will notice the difference immediately :-)
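What happens is this: with define on (the SQL*Plus default) the ampersand is treated as the prefix of a substitution variable, so instead of inserting the URL verbatim, SQL*Plus stops and prompts for a value for every &name it finds, with prompts something like:
Enter value for version:
Enter value for format:
Enter value for width:
... and the stored string then contains whatever you typed instead of the original &VERSION, &FORMAT and so on. With set define off the ampersand is just an ordinary character and the URL is stored exactly as written.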
… and if you didn’t know about Oracle substitution variables, maybe now you can find some interesting application for that functionality …..

On this link you can read more about it.

… enjoy …

Saturday, April 21, 2012

GOIDGenerator & CRCCalculator

keywords: Feature Manipulation Engine (FME), GOIDGenerator, CRCCalculator, Coordinate System, identifier based on location and/or geometry

...reading only the caption of this post, some people will immediately recognise that it is about FME...

In case you want to create an identifier based on a point location (coordinates) you can use the GOIDGenerator or the CRCCalculator transformer. If you are using GOID you should create your ID from the first 16 characters of the generated GOID string. Later in this post I refer to GOID in this shortened meaning (the first 16 characters).


If you are using CRC you should take into account only the geometry, not the attributes (the "Attributes to use in CRC calculation" setting should be empty).



Solutions using GOID and CRC should generate the same result, but in some cases they do not. To demonstrate this I have created a sample workbench file.



At the beginning of the translation process I created a geometry object (a point) and generated GOID and CRC attributes based only on position/geometry (attributes goid_1 and _crc_1). After that I assigned a coordinate system to the geometry object and repeated the generation of GOID and CRC, this time storing the values in differently named attributes (goid_2 and _crc_2). At the end of the process two Testers check the equality of the generated GOIDs and the generated CRCs. In the case of GOID the attribute value changed after the CoordinateSystemSetter transformer, which wasn't the case for the CRC values.

Take that into account when you compare data coming from two different sources: sometimes the coordinates can be exactly the same, but an omitted coordinate system definition can mislead you if you are relying on GOID.

Let me describe the case that drove me to explore the GOIDGenerator transformer in more detail. Here is the conceptual schema.



Data in the database do not require reprojection because they are already in the official coordinate system. New data are imported from a csv file and, after geometry creation (2DPointReplacer), transformed into the official coordinate system. A FeatureMerger should detect points whose coordinates didn't change. Detection wasn't possible because the branch with data coming from Oracle Spatial didn't have a coordinate system attached. The solution in this case was simple: "attach a CS to the Oracle Spatial reader" … but sometimes, when you are debugging a workbench process, such a solution is not so obvious and you have to run a number of tests to find out what the issue is.

To avoid the coordinate system influencing GOID generation you can use the following workflow:
  • store CS to attribute
  • remove CS
  • generate GOID
  • set CS based on attribute value


… or you can use CRCCalculator...

There are always many solutions, but keep in mind that the coordinate system information is in some way stored in the GOID and not in the CRC... sometimes you need that information and sometimes you don't.

... enjoy ...

Check Your Oracle Spatial Tables

keywords: oracle, spatial tables, spatial index, metadata, validity check

Here are a few helpful SQL statements. You can use them when you migrate or create a new Oracle database with spatial tables.

If a select statement returns some rows, there is something wrong with your spatial tables. The first statement checks for missing metadata, the second for missing spatial indexes and the third for index validity.

Keep in mind that these queries are sometimes not sufficient to prove that everything is OK with your spatial database or spatial data definitions. For example, you can have invalid entries in the tables related to the definition of the spatial reference system (if you are using a custom SRID for your spatial data).

Anyway, the posted queries can help you in many cases to detect errors related to the migration or creation of spatial tables. The check is performed for all database users (schemas). If you hit some restriction, run these queries as the system or sys user... or some other user that has privileges to select from all the required tables.

… enjoy ...
-- to get all spatial tables without an entry in the spatial metadata (mdsys.sdo_geom_metadata_table)
select geomet.sdo_owner, geomet.sdo_table_name, geomet.sdo_column_name,
    atbl.owner, atbl.table_name, atbl.column_name
    from mdsys.sdo_geom_metadata_table geomet, all_tab_cols atbl   
        where geomet.sdo_owner       (+)  = atbl.owner
        and geomet.sdo_table_name    (+)  = atbl.table_name
        and geomet.sdo_column_name   (+)  = atbl.column_name
        and geomet.sdo_owner              is null
        and geomet.sdo_table_name         is null
        and geomet.sdo_column_name        is null
        and atbl.data_type_owner       =  'MDSYS'  
        and atbl.data_type             =  'SDO_GEOMETRY'
        and atbl.owner                != 'MDSYS';
-- to get all geometry columns without any index (i.e. missing spatial indexes)
select alic.index_owner, alic.table_name, alic.column_name, 
    atbl.owner, atbl.table_name, atbl.column_name
    from  all_ind_columns alic, all_tab_cols atbl
        where alic.index_owner     (+)  = atbl.owner
        and   alic.table_name      (+)  = atbl.table_name
        and   alic.column_name     (+)  = atbl.column_name
        and   alic.index_owner          is null
        and   alic.table_name           is null
        and   alic.column_name          is null
        and   atbl.data_type_owner   =  'MDSYS'  
        and   atbl.data_type         =  'SDO_GEOMETRY'
        and   atbl.owner            !=  'MDSYS';
--to check spatial index validity
select alin.owner, alin.table_name, alin.status, 
    alin.domidx_status, alin.domidx_opstatus
    from  all_indexes alin
        where (alin.index_type        = 'DOMAIN'  
           and alin.ityp_name         = 'SPATIAL_INDEX' 
           and alin.table_name not like '%$%')
        and   (alin.status           != 'VALID' 
           or  alin.domidx_status    != 'VALID' 
           or  alin.domidx_opstatus  != 'VALID');
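If the first two queries return rows, the usual fix is to register the missing metadata and to build the spatial index. Here is a minimal sketch; the table name, column name, dimension bounds, tolerance and SRID are made up, so adjust them to your data:
insert into user_sdo_geom_metadata (table_name, column_name, diminfo, srid)
values ('MY_TABLE', 'GEOM',
        sdo_dim_array(sdo_dim_element('X', 200000, 800000, 0.005),
                      sdo_dim_element('Y', 4700000, 5200000, 0.005)),
        3765);
commit;

create index my_table_sx on my_table (geom)
  indextype is mdsys.spatial_index;
Note that the metadata entry must exist before the spatial index can be created.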

Friday, April 13, 2012

Reserved Characters in HTML: google-code-prettify

keywords: html reserved characters, google-code-prettify, xml

...just a small tip if you use google-code-prettify...

If you want to publish some code that includes xml tags, or generally speaking if your code contains reserved HTML characters like the less-than and greater-than signs, you have to make a replacement: instead of < you should put &lt; and instead of > put &gt; (similarly, a bare & should become &amp;) …

For example, if you want to prettify code:

    private  String getStyleTemplate(){
        String xmlTemplate ="";
        xmlTemplate = xmlTemplate + "<style>\n";
        xmlTemplate = xmlTemplate + "  <id>:style_id:</id>\n";
        xmlTemplate = xmlTemplate + "  <name>:style_name:</name>\n";
        xmlTemplate = xmlTemplate + "  <sldVersion>\n";
        xmlTemplate = xmlTemplate + "    <version>1.0.0</version>\n";
        xmlTemplate = xmlTemplate + "  </sldVersion>\n";
        xmlTemplate = xmlTemplate + "  <filename>:style_filename:</filename>\n";
        xmlTemplate = xmlTemplate + "</style>";
        
        return xmlTemplate;
    }

you should enter in the HTML page:
<pre class="prettyprint linenums lang-java">

    private  String getStyleTemplate(){
        String xmlTemplate ="";
        xmlTemplate = xmlTemplate + "&lt;style&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;id&gt;:style_id:&lt;/id&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;name&gt;:style_name:&lt;/name&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;sldVersion&gt;\n";
        xmlTemplate = xmlTemplate + "    &lt;version&gt;1.0.0&lt;/version&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;/sldVersion&gt;\n";
        xmlTemplate = xmlTemplate + "  &lt;filename&gt;:style_filename:&lt;/filename&gt;\n";
        xmlTemplate = xmlTemplate + "&lt;/style&gt;";        

        return xmlTemplate;
    }
</pre>
and the result should look like this:
    private  String getStyleTemplate(){
        String xmlTemplate ="";
        xmlTemplate = xmlTemplate + "<style>\n";
        xmlTemplate = xmlTemplate + "  <id>:style_id:</id>\n";
        xmlTemplate = xmlTemplate + "  <name>:style_name:</name>\n";
        xmlTemplate = xmlTemplate + "  <sldVersion>\n";
        xmlTemplate = xmlTemplate + "    <version>1.0.0</version>\n";
        xmlTemplate = xmlTemplate + "  </sldVersion>\n";
        xmlTemplate = xmlTemplate + "  <filename>:style_filename:</filename>\n";
        xmlTemplate = xmlTemplate + "</style>";        

        return xmlTemplate;
    }
I hope this helps someone.

To get exactly the same result as on this page, add this custom css to your template in Blogger and use the default prettify theme.
li.L0, li.L1, li.L2, li.L3, li.L4, li.L5, li.L6, li.L7, li.L8, li.L9
{
    color: #555;
    list-style-type: decimal;
}
...enjoy

Tuesday, April 10, 2012

Overcome performance issues using FME command line and Java

keywords: command line, FME, large datasets, database, SQL, Java

In this post I describe a procedure that uses FME's command line functionality to overcome some performance issues with large datasets.

… just a few words about FME command line ….

An FME workbench can be run from the FME Workbench application itself or simply from the command line. You can see the exact command in the translation log window.


fme.exe my_workbench.fmw
         --SourceDataset_CSV C:\my_data.txt
         --DestDataset_XLS_ADO C:\my_export.xls
The first parameter is the command itself (it starts the FME engine), the second is the workbench name; after that there can be an arbitrary number of parameters, as defined in the workbench file (user published parameters), though some of them, like the source or destination dataset, are published by default (check the FME documentation for more details).
In this sample command the source dataset parameter for the csv file is named SourceDataset_CSV and the destination dataset parameter for the xls file is named DestDataset_XLS_ADO.
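User published parameters that you define yourself are passed the same way. For example, assuming a workbench with a hypothetical published parameter named COUNTY_FILTER, the call could be extended like this:
fme.exe my_workbench.fmw
         --SourceDataset_CSV C:\my_data.txt
         --DestDataset_XLS_ADO C:\my_export.xls
         --COUNTY_FILTER PP:COUNTY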

Create your workbench file and run it. Examine your own translation log window and see which parameters appear in the command line. That is going to be your starting point for optimisation.

I don’t want to say that the proposed procedure can solve all performance issues, but some of them you will surely overcome, especially if your task is similar to my case.

… the case …

I had to calculate the mean, median and mode height for all Croatian LPIS parcels (land parcel information system). The source data were:
  • height data: dgn files for the whole country (DTM: points and breaklines; accuracy: +/- 2 m). To get a better feeling for the data size, imagine a 25 m x 25 m height grid covering an area of 56,542 square km.
  • LPIS polygons (number of polygons > 1 000 000)

In the workbench file I had to overlay all LPIS polygons with the DTM and then calculate the mean, median, mode and accuracy for every parcel. The procedure to solve this task is simple, but doing the calculation for the whole country takes a lot of time. The conceptual workflow is presented in the next figure.

From my point of view, if you are working with large datasets and a large number of files it is always a good idea to upload your data to a database. In fact, this post assumes that you have access to some database system.

I always work that way; from experience I can say that the database will save you time and that the conversion from files to a database system is worth the effort. In my work I mostly use Oracle (sometimes PostGIS). For convenience, if you have to refer to the original files later in the process, consider adding the format attributes (fme_basename, fme_dataset, fme_feature_type) during the conversion.

The conversion from a file system to a database system can be described by the following schema.



You may consider some other approach depending on the source data format and data model. The proposed workflow is sufficient for CAD files (sometimes you have to expose color, style and weight and store them as attributes, just like the other format attributes in step 2).

Let me go back to the first schema. Reading the DTM files, from files or even from the database, can be time consuming. In the case of the Croatian DTM I had to wait many hours (I don’t remember exactly, but I’m speaking of a period longer than 10 hours). Reading the LPIS polygons wasn’t an issue (step 2). Try to imagine how long it would take to overlay all LPIS polygons with the DTM, and how much RAM you would need to perform such a task. It is not hard to see that it cannot be solved easily. In some cases it can even be impossible without a process redesign.

… redesign …

The goal is to make the calculation (mean, median, mode of height) at the parcel level (one polygon). The parcel defines the area of interest and the DTM data provide the height information. The workflow that solves the problem can be described as follows.


The Java application posted below creates database views that play the role of the source datasets in the workbench file, which is then executed from the same application using the command line approach. To be honest, this application could be written in many programming languages, but for this particular task I decided to use Java. You just have to create a workbench file that connects to the database views. Here is the java code... (to run this application you need the Oracle JDBC driver).


package my.util.fme;

import java.sql.*;
import java.io.*;

public class CreateBatch {

  public static void main(String[] args)
      throws ClassNotFoundException, SQLException, IOException {

    String fmeDat, datasource, username, password, fileName;
    String q1, q2, q3;
    String command;
    String line;

    try {
      if (args.length == 5) {

        fmeDat     = args[0];
        datasource = args[1];
        username   = args[2];
        password   = args[3];
        fileName   = args[4];

        FileWriter fstream;
        BufferedWriter out;

        Class.forName("oracle.jdbc.driver.OracleDriver");

        String url = "jdbc:oracle:thin:@oracle1.geofoto.lan:1521:" + datasource;

        Connection conn =
            DriverManager.getConnection(url, username, password);

        conn.setAutoCommit(false);

        Statement stmt  = conn.createStatement();
        Statement stmt1 = conn.createStatement();

        // step 1 in proposed workflow: loop over all parcel ids
        ResultSet rset = stmt.executeQuery(
            "select lp.id from lpis_parcels lp order by id");

        while (rset.next()) {

          fstream = new FileWriter(fileName);
          out = new BufferedWriter(fstream);

          // create view with one LPIS parcel
          // step 1 in proposed workflow
          q1 = "create or replace view v_lpis_parcels as "
             + "select lp.* from lpis_parcels lp where lp.id=" + rset.getString(1);

          // create view with DTM points that have an anyinteract topological
          // relation with the current parcel (step 2 in proposed workflow)
          q2 = "create or replace view v_dtm_points as "
             + "select dtm.* from dtm_points dtm, v_lpis_parcels lp "
             + "where sdo_anyinteract(dtm.geom, lp.geom) = 'TRUE'";

          // create view with DTM lines that have an anyinteract topological
          // relation with the current parcel (step 2 in proposed workflow)
          q3 = "create or replace view v_dtm_lines as "
             + "select dtm.* from dtm_lines dtm, v_lpis_parcels lp "
             + "where sdo_anyinteract(dtm.geom, lp.geom) = 'TRUE'";

          // these are DDL statements, so execute() instead of executeQuery()
          stmt1.execute(q1);
          stmt1.execute(q2);
          stmt1.execute(q3);

          // steps 3 & 4: performed inside the fme workbench file
          command = "fme.exe " + fmeDat
                  + " --SourceDataset_ORACLE8I " + datasource
                  + " --DestDataset_ORACLE8I " + datasource;

          // create batch file: contains only one command; it is recreated in
          // every iteration (probably it can be solved without file creation,
          // but this works and I didn't find another way to execute fme.exe)
          out.write(command);
          out.write('\n');
          out.close();

          // execute the batch file created in the previous step and echo its
          // output; reading until EOF also waits for fme.exe to finish
          Runtime rt = Runtime.getRuntime();
          Process prcs = rt.exec(fileName);
          InputStreamReader isr = new InputStreamReader(prcs.getInputStream());
          BufferedReader br = new BufferedReader(isr);

          while ((line = br.readLine()) != null)
            System.out.println(line);
        }

        stmt.close();
        stmt1.close();
        conn.close();

      } else {
        System.out.println("");
        System.out.println("java CreateBatch <workbench> <service> <username> <password> <batch_file>");
        System.out.println("");
        System.out.println("   FME workbench file: *.fmw");
        System.out.println("   Oracle Service Name: Ask Database Administrator!");
        System.out.println("   Username: Ask Database Administrator!");
        System.out.println("   Password: Ask Database Administrator!");
        System.out.println("   File Name: Name of batch file created during process");
      }
    } catch (SQLException e) {
      System.out.println("------------------------------");
      System.out.println("SQLException");
      System.out.println("------------------------------");
      System.out.println(e);
    } catch (ClassNotFoundException e) {
      System.out.println("------------------------------");
      System.out.println("ClassNotFoundException");
      System.out.println("------------------------------");
      System.out.println(e);
    } catch (IOException e) {
      System.out.println("------------------------------");
      System.out.println("IOException");
      System.out.println("------------------------------");
      System.out.println(e);
    }
  }
}
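Just as a sketch, the application might be compiled and started like this (the workbench name, service name, credentials and JDBC driver jar below are made up, so replace them with your own):
javac my/util/fme/CreateBatch.java
java -cp .;ojdbc6.jar my.util.fme.CreateBatch lpis_height.fmw ORCL scott tiger run_fme.bat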


In a nutshell:
  • prepare an FME workbench that connects to the database view(s) (keep in mind that the view(s) are created on the fly by the application)
  • using a starter application (java, python, c++, ….) create database view(s) with the minimum data needed for the FME translation process
  • invoke FME from the starter application (in the same run as the previous step)
  • loop the process until the task is solved for the whole dataset
  • store the result of every iteration for later reporting and analysis

… to conclude …

The optimisation of the described case is achieved by breaking the process into smaller units (working with one parcel at a time) and by the fact that the FME workbench doesn’t need to read the whole dataset at once, because it reads data from views that are created on the fly by the posted java application. The java application glues together two separate processes, data preparation (SQL) and execution of the fme workbench (FME command line), as presented in the workflow diagram.

enjoy!

Thursday, April 5, 2012

Catalog (Inspect all attributes)

keywords: PL/SQL, Oracle, data catalog

this post is just a generalisation of an already posted procedure that creates a catalog from implemented database tables

…. The previously posted procedure inspected only one attribute (VRSTA), but I later found out that there is valuable information stored in other attributes... Attribute names, descriptions and domains were unknown. This time I had to make a list of the attributes and values in the spatial tables, so that I could easily decide, based on the attribute values, which attributes might be interesting to customers and include them in reporting or in the info-click functionality of a web GIS application... yes, I know … one should know in advance what attributes to include in an application or report; unfortunately this was not my case :-) … based on the report produced by the posted PL/SQL procedure I just have to guess …

After you review the report (created by the posted procedure) you might find a number of columns with garbage data. Just remove them using this SQL statement:
alter table table_name drop column column_name;
to drop multiple columns you can use:
alter table table_name drop (column_name1,column_name2);

 … have fun!
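One small remark before the code: the procedure prints its report through dbms_output, so if you run it from SQL*Plus enable the output buffer first:
set serveroutput on size unlimited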

declare
  cursor c_tablice is
    select table_name t_name, column_name c_name from all_tab_cols
      where owner = 'PPSV'
      and data_type_owner is null     -- to skip object types such as sdo_geometry
      and table_name not like '%$%'   -- to skip system tables
      and column_name not like '%$%'  -- to skip system columns
      and column_name not like 'ID%'; -- to skip fields named ID%
      /*
        in my case all fields that follow the pattern ID% are automatically
        generated (sequence) and they are not interesting for the catalog;
        you can exclude some additional fields using the same principle
      */

  type vrsta_type is ref cursor;
  c_vrsta vrsta_type;
  vrsta   varchar2(1000);
  v_sql   varchar2(1000);
begin
  for r_tablice in c_tablice
  loop
    dbms_output.put_line('> TABLE: ' || r_tablice.t_name);
    v_sql := 'select distinct(' || r_tablice.c_name || ') d_vrsta from ' || r_tablice.t_name;
    open c_vrsta for v_sql;
    loop
      fetch c_vrsta into vrsta;
      exit when c_vrsta%NOTFOUND;
      dbms_output.put_line(r_tablice.c_name || ': ' || vrsta);
    end loop;
    dbms_output.put_line('');
    close c_vrsta;
  end loop;
end;
/
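With made-up table and column names, the produced report looks roughly like this (the header line repeats for every inspected column):
> TABLE: ROADS
SURFACE: asphalt
SURFACE: gravel

> TABLE: ROADS
LANES: 1
LANES: 2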

Wednesday, April 4, 2012

Feature Manipulation Engine & Linear Referencing

keywords: FME, LRS (linear referencing system)

… a little bit of something for FME addicts … my people :-)

Recently I had to solve one simple task regarding linear referencing. It is a common task for all systems where the coordinate system is related to linear elements: each feature location is expressed as a distance from the beginning of some line segment. Examples of such systems are road systems, pipeline systems and similar.

In the case I’m writing about, I had GPS data in csv format and road axes in an Oracle database. The GPS data represent the locations of images (taken by a specially equipped vehicle for road measurements) along the road. To import the images into an application that animates the ride down the road, I had to express the GPS coordinates of every image as a distance from the beginning of the road segment, i.e. I had to establish an LRS. For this particular task I chose FME.

For better understanding I’ll break the task down into smaller activities (steps):
1. create point geometry from the GPS data in csv format (if needed, change the coordinate system)
2. prepare the line geometry of the selected road segment
3. move (map) the measured GPS points to the nearest point on the road segment
4. calculate the measure for all points on the road segment (the old points and the new points added in step 3)
5. join the original attributes of the GPS points (the image ID needed for the animation) with the calculated measure (step 4)

Here is the resulting workbench file (FME 2012).




Unfortunately I can't post the original data, but if you have some additional questions feel free to ask. Maybe FME has some better solution for this task, but I didn't have additional time to explore. The whole schema broken into separate steps is here.

  • 2DPointAdder creates geometry from attribute data
  • Reprojector changes the coordinate system of the GPS data to match the coordinate system of the road data
  • Counter creates an ID on every point (I wonder why I put it here... just forget it ...)
  • CoordinateRounder removes useless decimal places from the coordinates
  • Tester removes points that were collected before the measuring car entered the discussed road
  • rounding the coordinates
  • length calculation (_uk value) using LengthCalculator :-)
  • calculation of the scale factor (ExpressionEvaluator): the ratio between the distance calculated from the geometry data (_uk) and the distance measured by odometer (STAC_ZAV) and stored in the database … this calculation is not so important for this solution but it may be required
  • store the ratio (_koef) in a variable using VariableSetter
  • remove the height coordinate (2DForcer)
  • NeighborFinder finds the closest point on the road line for every gps point; the coordinates are stored as attribute values
  • 2DPointReplacer takes the previously calculated coordinates from the attribute values and creates geometry objects
  • the custom transformer unique_geom creates a unique identifier for every point based on the point coordinates
  • overlay the points on the line (PointOnLineOverlayer) and join the generated line segments (LineJoiner)
  • calculate the measure for all vertices (points) of the line generated in the previous step
  • break the line feature into points (line vertices) using the Chopper
  • create a unique identifier for every point based on the point coordinates (unique_geom)
  • extract the generated measure (LR) and store it as an attribute value (MeasureExtractor)
  • retrieve the variable created in step 1 and store it as an attribute value (VariableRetriever)
  • scale the calculated measure values using the variable value retrieved in the previous step
  • using FeatureMerger, merge the calculated measure values (step 4) with the original attributes of the GPS points; the unique identifier based on geometry is used for this task (custom transformer: unique_geom)
  • remove unnecessary attributes using AttributeRemover
  • store the result in the desired format (in this case it was xls)