Posted Dec 19 by Pete Oliver.
Updated Dec 21.

Configuring the OpenText AppWorks Gateway 16.1 Proxy

This article presents a walk-through of configuring the AppWorks Gateway Proxy. You will configure an AppWorks Gateway Proxy to expose a simple web page and a simple web-based remote REST service.

The following features of the Proxy are demonstrated:

  • Whitelist composition
  • Blacklist entries for securing the proxied API
  • URL rewriting
  • Content rewriting

Prerequisites

  • AppWorks Gateway 16.1 or above. You can download it from here
  • A basic familiarity with the AppWorks Gateway Proxy, as covered by this article
  • A web browser. The example screenshots show Google's Chrome browser in use, but any modern web browser will do
  • NetTool. This is a useful tool that allows us to construct HTTP requests, as well as tunnel requests from a client to server in order to inspect the traffic as it flows. Get NetTool from here

Introduction

The AppWorks Gateway Proxy is an integral component of the OpenText AppWorks Gateway. Its function is to provide a reverse HTTP proxy for web services and simple web applications, enabling the following:

  • Single Point of Entry
  • Enforcement of Same-Origin Policy
  • Securing Web Service APIs
  • Simple Web Service API Routing
  • Meta APIs
  • Content Rewriting

For an explanation of these terms and how they apply to the AppWorks Gateway Proxy, see this article.

AppWorks Gateway Proxy Configuration Page

The AppWorks Gateway Proxy is configured via its main configuration page, which can be navigated to by selecting the Proxy option from the AppWorks Gateway administration user interface:

AppWorks Gateway Proxy Configuration Page

A Note About the Examples Presented Here

The examples presented here assume you are using a browser running on the same computer as your AppWorks Gateway, and that the AppWorks Gateway is reachable on http://localhost:8080. Please adjust the URLs you use accordingly if your Gateway is running elsewhere on your network.

A Simple Proxy Rule

From the main Proxy configuration page, click on New Rule. You will be presented with a blank, untitled, and unconfigured rule. Enter the following values:

  • Name: bbc
  • Whitelist: bbc

Now, under URL Mappings click on Add Mapping, and add the following mapping:

bbc ==> http://www.bbc.com

Your completed rule should look like this:

AppWorks Gateway Proxy Rule Creation

Click on Save Rule to complete the creation of your rule.

Next, you will need to enable the rule:

Enable the Proxy Rule

That's it! You're now proxying the BBC.

Test the Proxy Rule

To test the Proxy Rule, open a new browser window and navigate to the URL:

http://localhost:8080/bbc

You should now see the BBC home page being served up to you via your Proxy:

Test the Proxy Rule

The Simple Proxy Rule in detail

The proxy rule configured here is as simple as it can get, but it suffers from a few problems, as we'll soon discover. Let's examine how the rule is acted upon behind the scenes. For that, we'll first take a look at the role of Java Regular Expressions in the AppWorks Gateway Proxy.

The AppWorks Proxy works by intercepting requests made to the AppWorks Gateway internal web server, and matching those requests against rules that are formulated in Java Regular Expressions.

In essence, each rule declares a pattern in the Java Regular Expression language. When the pattern matches, it triggers some outcome, usually a substitution, and the substitution may itself contain elements of the regular expression language.

Internally, before the regular expression is applied to the incoming URL, the scheme, host, and port elements are removed, along with the leading / of the path. In this example, the incoming URL

http://localhost:8080/bbc

becomes:

bbc

In this simple case, when following the URL http://localhost:8080/bbc the single whitelist entry bbc matches the transformed URL, so the whitelist test has been passed. The transformed URL is then passed on to the mapping rules, where another match with the URL is found in this rule:

bbc ==> http://www.bbc.com

Substituting http://www.bbc.com for bbc, the AppWorks Gateway Proxy will open a connection to http://www.bbc.com and send the response back to your browser.
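The sequence just described can be sketched in a few lines of Java. This is an illustration only, not the Gateway's own code: the class and method names are hypothetical, and it assumes the Proxy looks for a match anywhere in the path and replaces the first occurrence, which is what the examples in this article suggest.

```java
import java.net.URI;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the Gateway's real implementation.
class SimpleRuleSketch {
    // Values taken from the "bbc" rule configured above.
    static final Pattern WHITELIST = Pattern.compile("bbc");
    static final Pattern MAPPING_MATCH = Pattern.compile("bbc");
    static final String MAPPING_REPLACE = "http://www.bbc.com";

    // Drop scheme, host and port, then the leading slash of the path.
    static String normalize(String incomingUrl) {
        String path = URI.create(incomingUrl).getPath();
        return path.startsWith("/") ? path.substring(1) : path;
    }

    // Returns the target URL, or empty if the whitelist does not match.
    static Optional<String> target(String incomingUrl) {
        String path = normalize(incomingUrl);
        if (!WHITELIST.matcher(path).find()) {
            return Optional.empty();
        }
        // Assumption: the first match in the path is replaced by the target.
        return Optional.of(MAPPING_MATCH.matcher(path).replaceFirst(MAPPING_REPLACE));
    }

    public static void main(String[] args) {
        String url = "http://localhost:8080/bbc";
        System.out.println(normalize(url));                   // bbc
        System.out.println(target(url).orElse("(no match)")); // http://www.bbc.com
    }
}
```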

However, any URL with bbc in its path will be a match for this simple rule. Try the following in your browser:

http://localhost:8080/bbc123

The URL matches the whitelist because bbc is contained within the URL path, and the Mapping Rule turns:

bbc123

into

http://www.bbc.com123

Which is not a valid web address.

Likewise, consider this URL that also gets accepted by the whitelist:

http://localhost:8080/aabbccdd

The Mapping Rule then turns this:

aabbccdd

into:

aahttp://www.bbc.comcdd

Which is not even a valid Internet URL.
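The over-matching is easy to reproduce outside the Gateway. The sketch below applies the unanchored pattern bbc to the three paths discussed so far. It assumes, as the behaviour above suggests, that the first match found anywhere in the path is replaced by the target; the class name is made up for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; it only mimics the behaviour described in the text.
class UnanchoredMatchDemo {
    // The unanchored pattern from the simple rule.
    static final Pattern RULE = Pattern.compile("bbc");

    // Assumption: the first match anywhere in the path is replaced by the target.
    static String rewrite(String path) {
        return RULE.matcher(path).replaceFirst("http://www.bbc.com");
    }

    public static void main(String[] args) {
        for (String path : new String[] {"bbc", "bbc123", "aabbccdd"}) {
            System.out.println(path + " -> " + rewrite(path));
        }
        // bbc -> http://www.bbc.com
        // bbc123 -> http://www.bbc.com123
        // aabbccdd -> aahttp://www.bbc.comcdd
    }
}
```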

Let's make the rule a little more robust. Edit the rule, and for the whitelist change the current entry to

^bbc(/.*|$)

and for the single URL Mapping, change it to:

^bbc(/.*|$) ==> http://www.bbc.com$1

Change the BBC Proxy Rule

Hit Save Rule, and try out your new proxy rule. Here are some examples of a successful match and transformation:

  • http://localhost:8080/bbc becomes http://www.bbc.com
  • http://localhost:8080/bbc/news becomes http://www.bbc.com/news
  • http://localhost:8080/bbc/sport becomes http://www.bbc.com/sport

You get the idea.

Here are some URLs that won't match, and so won't be served by the Proxy:

  • http://localhost:8080/aabbccdd
  • http://localhost:8080/bbc123

What we've done here is to use the Java Regular Expression language to constrain what matches and what doesn't:

  • ^ ties the match to the start of the URL path
  • (/.*|$) means that the match (to bbc) succeeds only if bbc is followed by the / character and any number of other characters (specified by /.*), or is followed by nothing at all (the end of the string, specified by $)
  • The parentheses (...) form a capture group. Whatever is captured within these parentheses can be carried over to the target URL in numbered placeholders. This is the first explicit capture group, so it has the number 1 and is referred to in the target URL as $1. This can be seen in the rule in the target URL http://www.bbc.com$1
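To see the anchor and the capture group at work, the revised rule can be tried directly with java.util.regex. This is a sketch with a hypothetical class name; it uses only the pattern and replacement from the rule above, and assumes the Proxy applies them the way Matcher.replaceFirst does.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for the anchored rule.
class AnchoredRuleDemo {
    static final Pattern RULE = Pattern.compile("^bbc(/.*|$)");
    static final String TARGET = "http://www.bbc.com$1";

    // Returns the rewritten URL, or empty if the path does not match the rule.
    static Optional<String> rewrite(String path) {
        Matcher m = RULE.matcher(path);
        if (!m.find()) {
            return Optional.empty();
        }
        // $1 in the replacement is filled with whatever group 1 captured.
        return Optional.of(m.replaceFirst(TARGET));
    }

    public static void main(String[] args) {
        for (String path : new String[] {"bbc", "bbc/news", "bbc123", "aabbccdd"}) {
            System.out.println(path + " -> " + rewrite(path).orElse("(no match)"));
        }
        // bbc -> http://www.bbc.com
        // bbc/news -> http://www.bbc.com/news
        // bbc123 -> (no match)
        // aabbccdd -> (no match)
    }
}
```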

Composing Regular Expressions

Sometimes it can get a bit tricky formulating the correct set of regular expressions for the desired Proxy configuration. Online tools exist that can take some of the guesswork out of regular expression composition. I find the Regular Expression Test Page for Java particularly useful.

An API Proxy Example

In general, you won't be Proxying whole websites with the AppWorks Gateway Proxy - it is really intended for proxying web services APIs. This second example walks you through configuring a Proxy rule set for such a web service.

The Basics

We're going to use a free, but fake, REST web service in the remainder of this tutorial. The fake service can be found here: https://jsonplaceholder.typicode.com/

One of the first decisions you should make is which virtual path to construct for your proxied API. Looking at the documentation for JSONPlaceholder, the following root paths are defined:

  • /posts
  • /comments
  • /albums
  • /photos
  • /todos
  • /users

We could, if we wish, map these exact paths, so that URLs to the proxied API would look like this:

  • http://localhost:8080/posts
  • http://localhost:8080/comments
  • Etc

However, such an approach leads to poor manageability of the proxied API as well as the potential for path conflicts with other APIs and services exposed or hosted by the gateway. Instead, we'll use the root path /fora (for Fake Online REST API), so that the proxied API will look like this:

  • http://localhost:8080/fora/posts
  • http://localhost:8080/fora/comments
  • Etc

Create a new Proxy Rule with the following values:

  • Name: fora
  • Whitelist: ^fora/.*
  • URL Mapping: fora(.*) ==> http://jsonplaceholder.typicode.com$1

Create the Proxy Rule

Enable the rule and you're good to go.

Enable the Proxy Rule

We can now test the rule. In your browser, follow the URL http://localhost:8080/fora/posts. If all goes well, you should see something like this:

Follow the Proxy Rule
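If you would rather make the same check from code, a minimal sketch using the HTTP client built into Java 11 and later might look like this. The class name is hypothetical, the Accept header is an optional extra, and the URL assumes the Gateway is reachable on http://localhost:8080 as in the rest of this article.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper; requires Java 11 or later.
class ForaPostsCheck {
    // Build the GET request separately so it can be inspected without sending it.
    static HttpRequest buildRequest(String gatewayBase) {
        return HttpRequest.newBuilder(URI.create(gatewayBase + "/fora/posts"))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        try {
            HttpResponse<String> response = client.send(
                    buildRequest("http://localhost:8080"),
                    HttpResponse.BodyHandlers.ofString());
            System.out.println("Status: " + response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Typically means the Gateway is not running on the assumed address.
            System.out.println("Gateway not reachable: " + e.getMessage());
        }
    }
}
```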

NetTool

Using a browser for testing APIs soon becomes troublesome. We need something with a little more horsepower that will allow us to construct more complex requests, and allow us to inspect both the traffic we generate and the traffic we receive. There are many tools to choose from. A good one is NetTool, which I'm going to use for the remainder of this article.

Restricting API Exposure with the Whitelist

Quite often, the solution you're crafting will use only a portion of the proxied API. It is good practice to restrict your proxy to only the essential components of the proxied API, for the following reasons:

  • It reduces the potential amount of traffic that the proxied API could receive
  • It reduces the exposed API footprint, and with it the potential for intrusion attacks
  • It prevents inadvertent data leakage, by cutting off non-required parts of the API that would otherwise serve up these data

Let's give NetTool a spin. Fire it up and have it send a GET request to http://localhost:8080/fora/comments

NetTool comments

Notice that we can easily see the success response status code 200, and the response body is laid out for us.

Now let's imagine that your solution does not use the comments section of the proxied API, and that furthermore you wish to remove access to it altogether. One way you might do this is to get creative with the Proxy Whitelist and make sure only the URLs you wish to expose are whitelisted. In general, this is a good approach, but sometimes it's a little difficult to achieve. Instead, you can introduce a blacklist entry, which we shall do now.
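For comparison, here is what the whitelist-only approach could look like before we move on to the blacklist. The pattern ^fora/(posts|users)(/.*|$) is a hypothetical example and not part of this walk-through's configuration; it would expose only the /posts and /users resources. The sketch assumes whitelist entries are tested for a match anywhere in the path.

```java
import java.util.regex.Pattern;

// Hypothetical demo of a narrower whitelist entry.
class NarrowWhitelistDemo {
    // Illustrative pattern only: exposes /posts and /users, nothing else.
    static final Pattern WHITELIST = Pattern.compile("^fora/(posts|users)(/.*|$)");

    static boolean allowed(String path) {
        return WHITELIST.matcher(path).find();
    }

    public static void main(String[] args) {
        String[] paths = {"fora/posts", "fora/posts/1", "fora/users", "fora/comments", "fora/postscript"};
        for (String path : paths) {
            System.out.println(path + " -> " + allowed(path));
        }
        // Only the first three paths are allowed.
    }
}
```

The drawback is the one noted above: as the number of exposed resources grows, a single whitelist expression becomes harder to read and maintain than a broad whitelist with a few explicit exceptions.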

Edit the Proxy Rule and add the following Blacklist entry:

  • Match: ^fora/comments
  • HTTP Response Status: 404

Add Blacklist

Hit Save Rule and return to your NetTool window. In NetTool, resend the same request as before. Your request is now met with a 404 status response and a blank body - you have successfully blacklisted the comments REST resource.

Comments resource now not available through Proxy
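Conceptually, a blacklist entry pairs a pattern with the HTTP status to return when it matches. The sketch below models just that lookup. It is not the Gateway's code: the class and method names are hypothetical, and it assumes entries are tested in order for a match anywhere in the path.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalInt;
import java.util.regex.Pattern;

// Hypothetical model of a blacklist lookup.
class BlacklistSketch {
    // Blacklist pattern -> HTTP status to answer with, kept in insertion order.
    static final Map<Pattern, Integer> BLACKLIST = new LinkedHashMap<>();
    static {
        BLACKLIST.put(Pattern.compile("^fora/comments"), 404);
    }

    // Returns the status of the first matching entry, or empty if none matches.
    static OptionalInt blockedStatus(String path) {
        for (Map.Entry<Pattern, Integer> entry : BLACKLIST.entrySet()) {
            if (entry.getKey().matcher(path).find()) {
                return OptionalInt.of(entry.getValue());
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(blockedStatus("fora/comments")); // OptionalInt[404]
        System.out.println(blockedStatus("fora/posts"));    // OptionalInt.empty
    }
}
```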

Preventing Path Manipulation Attacks

The /comments path was blacklisted by adding an exception to the whitelist: a URL pattern that blocks any incoming URL matching it. However, as it stands the restriction is very easy to bypass. For example, consider a GET request with the following URL:

http://localhost:8080/fora/posts/../comments

Try it out in NetTool:

Path Manipulation

You've circumvented the blacklist!

../ is a relative path that means 'go back up a level'. To prevent path manipulation attacks like this, you should always have ../ blacklisted.

Edit the Proxy Rule again, and add the following Blacklist entry:

  • Match: \.\./
  • HTTP Response Status: 403

Path Manipulation Counter-measure

The period (.) is a special symbol in regular expressions, so to be taken literally it must be escaped with the \ character: a literal period is written as \. (a backslash followed by a period).

We specify a response status of 403 to indicate the proxy's refusal to honor the request.

Re-issue the request using NetTool and you should now see a 403 status response for any request that attempts to use relative path manipulation:

Path Manipulation Counter-measure complete
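The following sketch shows why the original blacklist entry misses the manipulated path and why the new entry catches it. The class and method names are hypothetical, and URI.normalize() merely stands in for whichever downstream component ends up resolving the ../ segment.

```java
import java.net.URI;
import java.util.regex.Pattern;

// Hypothetical demo of the path manipulation and its counter-measure.
class PathTraversalDemo {
    static final Pattern COMMENTS_ENTRY = Pattern.compile("^fora/comments");
    static final Pattern DOT_DOT_ENTRY = Pattern.compile("\\.\\./");

    static boolean hitsCommentsEntry(String path) {
        return COMMENTS_ENTRY.matcher(path).find();
    }

    static boolean hitsDotDotEntry(String path) {
        return DOT_DOT_ENTRY.matcher(path).find();
    }

    // What the path becomes once the relative segment is resolved.
    static String resolved(String path) {
        return URI.create("/" + path).normalize().getPath();
    }

    public static void main(String[] args) {
        String path = "fora/posts/../comments";
        System.out.println(hitsCommentsEntry(path)); // false: the path starts with fora/posts
        System.out.println(resolved(path));          // /fora/comments
        System.out.println(hitsDotDotEntry(path));   // true: caught by the new entry
    }
}
```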

Content Rewriting

In many cases, it's useful to rewrite content that is outgoing from the Proxied API. For example, you may wish to replace internal IP addresses and DNS names with external ones. In this example, we're going to use content rewriting to prevent sensitive data leakage from the /users resource.

If you use NetTool to issue a GET request against http://localhost:8080/fora/users, you'll see a response that contains email addresses:

Data Leakage

You may not wish personal email addresses to cross the boundary of your organization like this. Email addresses can be used by attackers as valid login names for a brute force attack, or more subtly for social engineering, in which victims are persuaded to act on fake emails that appear genuine.

What we're going to do now is add an Outgoing Rule that will replace all outgoing email addresses from the /users resource with the single email address info@opentext.com.

Edit the Proxy Rule once again, and add the following Outgoing Rule:

"email"\s*:\s*".+?"\s*(,?) ==> "email":"info@opentext.com"$1

Data Leakage

Save the rule, and rerun the last request in NetTool:

Data Leakage Fixed

Notice that all email addresses in the response have now been replaced with info@opentext.com.
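The Outgoing Rule can be tried against a sample body with java.util.regex. In this sketch the class name and the JSON fragment are made up for illustration, and replaceAll is assumed to mirror how the Proxy applies the rule to every match in the response body.

```java
import java.util.regex.Pattern;

// Hypothetical demo of the Outgoing Rule's match and replacement.
class EmailMaskDemo {
    // Match and replacement copied from the Outgoing Rule above.
    static final Pattern MATCH = Pattern.compile("\"email\"\\s*:\\s*\".+?\"\\s*(,?)");
    static final String REPLACE = "\"email\":\"info@opentext.com\"$1";

    static String rewrite(String body) {
        return MATCH.matcher(body).replaceAll(REPLACE);
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from the real service.
        String body = "[{\"id\": 1, \"email\": \"a@example.com\", \"phone\": \"123\"},"
                + "{\"id\": 2, \"email\" : \"b@example.org\"}]";
        System.out.println(rewrite(body));
    }
}
```

The lazy quantifier .+? stops at the first closing quote, so each email value is replaced on its own, and the optional trailing comma captured in group 1 is put back by $1 so the JSON stays well-formed.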

Debugging Your Proxy Rules

A good place to start is your Tomcat access logs. These are generally to be found under the TOMCAT_HOME/logs directory, and take the name localhost_access_log.<date>.txt. An example snippet is shown below:

127.0.0.1 - - [20/Dec/2016:13:45:01 -0500] "PUT /v3/admin/proxy/config/105 HTTP/1.1" 200 335
127.0.0.1 - - [20/Dec/2016:13:45:01 -0500] "GET /v3/admin/proxy/config HTTP/1.1" 200 1944
127.0.0.1 - - [20/Dec/2016:13:45:06 -0500] "GET /fora/posts/../comments HTTP/1.1" 403 -
127.0.0.1 - - [20/Dec/2016:13:55:53 -0500] "GET /fora/users HTTP/1.1" 200 5658
127.0.0.1 - - [20/Dec/2016:14:00:08 -0500] "GET /v3/admin/auth HTTP/1.1" 200 187
127.0.0.1 - - [20/Dec/2016:14:13:40 -0500] "PUT /v3/admin/proxy/config/110 HTTP/1.1" 200 476
127.0.0.1 - - [20/Dec/2016:14:13:40 -0500] "GET /v3/admin/proxy/config HTTP/1.1" 200 2085
127.0.0.1 - - [20/Dec/2016:14:13:46 -0500] "GET /fora/users HTTP/1.1" 200 5607
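When the access log gets busy, it can help to filter it down to the proxied requests. The sketch below parses lines in the layout shown above and keeps only requests under /fora. The class name and the regular expression are my own and assume the log format matches the snippet exactly.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning the access log.
class AccessLogFilter {
    // Groups: 1 = HTTP method, 2 = request path, 3 = response status.
    static final Pattern LINE = Pattern.compile(
            "^\\S+ \\S+ \\S+ \\[[^\\]]+\\] \"(\\S+) (\\S+) \\S+\" (\\d{3}) \\S+$");

    // Returns "METHOD path -> status" for requests under /fora, otherwise empty.
    static Optional<String> summarize(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches() || !m.group(2).startsWith("/fora")) {
            return Optional.empty();
        }
        return Optional.of(m.group(1) + " " + m.group(2) + " -> " + m.group(3));
    }

    public static void main(String[] args) {
        String line = "127.0.0.1 - - [20/Dec/2016:13:45:06 -0500] "
                + "\"GET /fora/posts/../comments HTTP/1.1\" 403 -";
        System.out.println(summarize(line).orElse("(not a proxied request)"));
        // GET /fora/posts/../comments -> 403
    }
}
```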

When you need to dig deeper, you can enable more detailed logging in the AppWorks Gateway (as of version 16.1). To do this, first shut down Tomcat, then edit the file

TOMCAT_HOME/gateway/WEB-INF/classes/log4j.properties

Set the root logger level to DEBUG, and uncomment the PROXY section:

# DEBUG level logging for the Gateway, we hide some of the Spring and Swagger stuff for clarity at the end
log4j.rootLogger=DEBUG, otagLog, stdout

{...}

#### PROXY ####
# To debug proxy whitelists and URL rewriting, uncomment the following entries
log4j.logger.com.opentext.otag.schema.proxy = TRACE
log4j.logger.com.opentext.otag.camel.proxy = TRACE
log4j.logger.org.apache.camel.component.http4 = DEBUG
log4j.logger.org.apache.camel.component.http.DefaultHttpBinding = TRACE

{...}

Restart Tomcat to enable the deeper logging. The Proxy debug entries will appear in the AppWorks Gateway log TOMCAT_HOME/logs/gateway.log. Here is an example snippet:

2016-12-20 14:27:23 TRACE ProxyRuleset:221 - Whitelist match: "fora/users" is matched by "^fora/.*" in ruleset "fora"
2016-12-20 14:27:23 TRACE AWGProxyWhitelist:154 - URI "/fora/users" matches whitelist entry in ruleset "fora"
{...}
2016-12-20 14:27:23 TRACE DefaultHttpBinding:129 - HTTP method GET
2016-12-20 14:27:23 TRACE DefaultHttpBinding:130 - HTTP query null
2016-12-20 14:27:23 TRACE DefaultHttpBinding:131 - HTTP url http://localhost:8080/proxy/fora/users
2016-12-20 14:27:23 TRACE DefaultHttpBinding:132 - HTTP uri /proxy/fora/users
2016-12-20 14:27:23 TRACE DefaultHttpBinding:133 - HTTP path /fora/users
2016-12-20 14:27:23 TRACE DefaultHttpBinding:134 - HTTP content-type null
2016-12-20 14:27:23 TRACE DefaultHttpBinding:209 - HTTP attachment javax.servlet.forward.request_uri = /fora/users
2016-12-20 14:27:23 TRACE DefaultHttpBinding:209 - HTTP attachment javax.servlet.forward.context_path = 
2016-12-20 14:27:23 TRACE DefaultHttpBinding:209 - HTTP attachment javax.servlet.forward.servlet_path = /fora/users
2016-12-20 14:27:23 TRACE DefaultHttpBinding:209 - HTTP attachment AWG_PROXY_CURRENT_RULESET = ProxyRuleset{id=116, name='fora', whitelist={[ProxyWhitelist{id=119, ruleset name='fora', whitelistValue='^fora/.*'}]}, urlMappings=[ProxyMapping{, match='fora(.*)', replace='http://jsonplaceholder.typicode.com$1', continueMatch=false, sortIndex=0}], blacklist={[ProxyBlacklistEntry{match='^fora/comments', responseCode=404, location='null'}, ProxyBlacklistEntry{match='\.\./', responseCode=403, location='null'}]}, outgoingRules=[OutgoingRule{continueMatch=false, sortIndex=0, match='"email"\s*:\s*".+?"\s*(,?)', replace='"email":"info@opentext.com"$1', scope=RESPONSE_BODY}], enabled=true, sortIndex=2}
2016-12-20 14:27:23 TRACE DefaultHttpBinding:209 - HTTP attachment AWG_REQUEST_CONTEXT = RequestContext{userId='null', userInfo=UserInfo{tokenInfo=null, attributes={}}, otagtoken='null'}
{...}
2016-12-20 14:27:23 TRACE ProxyRuleset:221 - Whitelist match: "fora/users" is matched by "^fora/.*" in ruleset "fora"
2016-12-20 14:27:23 DEBUG HttpHelper:441 - Using url rewrite to rewrite from url fora/users to http://jsonplaceholder.typicode.com/users -> http://jsonplaceholder.typicode.com/users
{...}
2016-12-20 14:27:23 DEBUG HttpProducer:156 - Executing http GET method: http://jsonplaceholder.typicode.com/users
2016-12-20 14:27:23 DEBUG HttpProducer:160 - Http responseCode: 200
{...}
2016-12-20 14:27:23 DEBUG DefaultHttpBinding:376 - Streaming response in chunked mode with buffer size 8192
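The debug log is verbose, so pulling out just the URL rewrite decisions can make it easier to read. This sketch extracts the source and target from the HttpHelper line shown above. The class name is hypothetical, and the pattern simply assumes the message wording stays as it appears in this snippet.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning gateway.log.
class RewriteLogScan {
    // Wording copied from the HttpHelper DEBUG line in the snippet above.
    static final Pattern REWRITE = Pattern.compile("rewrite from url (\\S+) to (\\S+)");

    // Returns "source => target" if the line records a URL rewrite.
    static Optional<String> rewriteSummary(String logLine) {
        Matcher m = REWRITE.matcher(logLine);
        return m.find()
                ? Optional.of(m.group(1) + " => " + m.group(2))
                : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "2016-12-20 14:27:23 DEBUG HttpHelper:441 - Using url rewrite to rewrite "
                + "from url fora/users to http://jsonplaceholder.typicode.com/users "
                + "-> http://jsonplaceholder.typicode.com/users";
        System.out.println(rewriteSummary(line).orElse("(no rewrite on this line)"));
        // fora/users => http://jsonplaceholder.typicode.com/users
    }
}
```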

About the Author

Pete Oliver is a long standing employee of OpenText, occupying the position of
Senior Software Architect. Pete has worked on various OpenText products
and platforms, including ECM Collaboration, OpenText Directory Services (OTDS),
AppWorks Developer, and more recently AppWorks Mobile.

