{"id":607,"date":"2022-01-06T17:46:12","date_gmt":"2022-01-06T17:46:12","guid":{"rendered":"https:\/\/ml-gis-service.com\/?p=607"},"modified":"2022-01-06T17:46:13","modified_gmt":"2022-01-06T17:46:13","slug":"data-science-leave-geopandas-and-create-beautiful-map-with-pygmt","status":"publish","type":"post","link":"https:\/\/ml-gis-service.com\/index.php\/2022\/01\/06\/data-science-leave-geopandas-and-create-beautiful-map-with-pygmt\/","title":{"rendered":"Data Science: Leave GeoPandas and Create Beautiful Map with pyGMT"},"content":{"rendered":"\n<p>The typical day of a scientist working with spatial data always ends with a map. It doesn\u2019t matter if those maps are created for reports, articles, or just for fun. Spatial information needs to be displayed on the map. Fortunately, Pythonistas have <em>GeoPandas<\/em>, and we can quickly show our maps, just like this one:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"289\" height=\"300\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150-289x300.png\" alt=\"Breast Cancer Risk Map generated with GeoPandas\" class=\"wp-image-611\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150-289x300.png 289w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150-985x1024.png 985w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150-768x799.png 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150.png 1129w\" sizes=\"auto, (max-width: 289px) 100vw, 289px\" \/><figcaption>Breast Cancer Risk Map generated with GeoPandas<\/figcaption><\/figure><\/div>\n\n\n\n<p>Ugh! So yes\u2026 This map is not exactly beautiful and not exactly clear. In summary, it\u2019s not exactly a map, maybe some kind of a colorful blot. Yes, it could be better, but it\u2019ll require a lot of tedious work! There is another option &#8211; with the Python package <em>pyGMT<\/em> and just a few lines of code, we can transform our ugly plot into a map that can be shared with other people:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/Round-1-2-1024x576.png\" alt=\"Image shows difference between two maps. One generated in GeoPandas and another in pyGMT.\" class=\"wp-image-609\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/Round-1-2-1024x576.png 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/Round-1-2-300x169.png 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/Round-1-2-768x432.png 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/Round-1-2-1536x864.png 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/Round-1-2.png 1920w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n\n<p>We will learn how to do it in this article. It is divided into three sections:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Data and environment,<\/li><li>Data Exploration and Basic Views,<\/li><li>Map development.<\/li><\/ul>\n\n\n\n<p>What are the main takeaways from this article?<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>We will be able to show data on a clear and beautiful maps,<\/li><li>We will learn how to transform data for visualization and mapping.<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Data and Environment<\/h2>\n\n\n\n<p>We must be sure that we start from the same position. The one condition is to have a similar programming environment. I assume that you\u2019re familiar with the <em>conda<\/em> package management tool. If you don\u2019t know what it is, then <a href=\"https:\/\/docs.conda.io\/projects\/conda\/en\/latest\/user-guide\/install\/index.html\">check <em>conda<\/em> here<\/a> and install it in your OS. It doesn\u2019t matter which type of installation you choose (<em>Anaconda<\/em> or <em>Miniconda<\/em>). <\/p>\n\n\n\n<p>If our <em>conda<\/em> installation is successful, we see in terminal keyword <code>(base)<\/code> at the beginning of the line. We are ready to set up our Python environment. It can be done with one command:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"powershell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">conda create -n pygmt-env -c conda-forge pygmt numpy geopandas matplotlib notebook<\/pre>\n\n\n\n<p>This command has done two things:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>It created container <code>-n<\/code> named <code>pygmt-env<\/code>.<\/li><li>This container installed packages <code>pygmt, numpy, geopandas, matplotlib, notebook<\/code> within it. Conda downloaded those packages from the <code>-c<\/code> channel named <code>conda-forge<\/code>.<\/li><\/ul>\n\n\n\n<p>It is crucial to install those packages within the virtual environment. It could be <em>conda<\/em> or<em> Python VirtualEnv<\/em> because we install <code>GDAL<\/code> along spatial packages and it may corrupt some applications if installed in the system-wide environment. If you have installed QGIS or other GDAL-depended packages in your operating system, then they could be damaged. If you\u2019re not sure about the installation, write a comment below, and we will go through it in more detail.<\/p>\n\n\n\n<p>Hopefully, we have configured the environment without surprises. We can activate it and start our exercise. Go to the directory where you want to run your project, open the terminal in this folder and activate the environment with the command:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"powershell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">conda activate pygmt-env<\/pre>\n\n\n\n<p>If you see <code>(pygmt-env)<\/code> instead of <code>(base)<\/code> at the beginning of the terminal line, everything works fine. The last step is to download the exercise dataset. It is available here: <a href=\"https:\/\/github.com\/SimonMolinsky\/pygmt-article-2021-01\">https:\/\/github.com\/SimonMolinsky\/pygmt-article-2021-01<\/a>. It is a shapefile named <code>smoothed_output.shp<\/code>. You can download only data, without code notebook, from the archive <code>data.zip<\/code>. It is here:\u00a0<a href=\"https:\/\/github.com\/SimonMolinsky\/pygmt-article-2021-01\/blob\/main\/data.zip\">https:\/\/github.com\/SimonMolinsky\/pygmt-article-2021-01\/blob\/main\/data.zip<\/a>. To do it, click on the <code>Download<\/code> button in <em>Github<\/em> website. Move data into a directory where you have started your project. Then return to the terminal with activated <code>pygmt-env<\/code> and type:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"powershell\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">jupyter-notebook<\/pre>\n\n\n\n<p>Press Return (Enter), and we are ready to go! Create new notebook and follow the code along this tutorial. Each code part is a single cell within this notebook. If you feel overwhelmed you can check solution in the repository with data.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Data Exploration and Basic Views<\/h2>\n\n\n\n<p>In the first cell of notebook type:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">import pygmt\nimport geopandas as gpd\nimport numpy as np\n\nimport matplotlib.pyplot as plt\nplt.style.use(\"dark_background\")<\/pre>\n\n\n\n<p>What is the purpose of each package?<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><code>pygmt<\/code> is a Python wrapper around <em>GMT<\/em> mapping tool \u2013 we will use it to create a beautiful map!<\/li><li><code>geopandas<\/code> is a core tool of a spatial analyst, and we can use it for spatial data exploration, transformation, basic analysis.<\/li><li><code>numpy<\/code> is a linear algebra package, but we will use it to transform our dataset.<\/li><li><code>matplotlib<\/code> is a core plotting library in Python. We won\u2019t use it for maps but histograms when we explore data.<\/li><\/ul>\n\n\n\n<p>The last line sets plotting styles in our notebook. It will affect figures plotted with <code>matplotlib<\/code> and the map plotted with <em>GeoPandas<\/em>.<\/p>\n\n\n\n<p>Press <strong>Shift+Enter<\/strong> and wait until the package&#8217;s content is loaded. (Loading and data processing in <em>Jupyter Notebook<\/em> is presented with an asterisk character on the left side of a cell). In the next cell we load our dataset. It represents the Breast Cancer Risk (<em><strong>Incidence Rate<\/strong><\/em> -> number of new cancer cases per year per 100,000 people in a region). If you want to know more about the dataset, <a href=\"https:\/\/github.com\/SimonMolinsky\/pyinterpolate-paper\/blob\/main\/paper-examples\/example-use-case\/joss%20paper%20example.ipynb\">you can read it here<\/a>. For this article, we will use only geometry and one column from the loaded table.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">data = gpd.read_file('smoothed_output.shp')\ndata.head()<\/pre>\n\n\n\n<p>Sample records from the <code>data<\/code> table are:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td><strong>id<\/strong><\/td><td><strong>reg.est<\/strong><\/td><td><strong>reg.err<\/strong><\/td><td><strong>rmse<\/strong><\/td><td><strong>geometry<\/strong><\/td><\/tr><tr><td>25019<\/td><td>0.1<\/td><td>7.6<\/td><td>18.5<\/td><td>POINT(2117322.312, 556124.507)<\/td><\/tr><tr><td>36121<\/td><td>5.4<\/td><td>6.8<\/td><td>27.8<\/td><td>POINT(1424501.989, 556124.507)<\/td><\/tr><\/tbody><\/table><figcaption>Sample records.<\/figcaption><\/figure>\n\n\n\n<p>We have five columns but we use only <code>reg.est<\/code> values and <code>geometry<\/code> coordinates. Before plotting we should take a look into <code>reg.est<\/code> statistics:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">data['reg.est'].describe()<\/pre>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>count<\/td><td>4960<\/td><\/tr><tr><td>mean<\/td><td>5.756108<\/td><\/tr><tr><td>std<\/td><td>9.794106<\/td><\/tr><tr><td>min<\/td><td>0<\/td><\/tr><tr><td>25%<\/td><td>0.943172<\/td><\/tr><tr><td>50%<\/td><td>2.729134<\/td><\/tr><tr><td>75%<\/td><td>6.517619<\/td><\/tr><tr><td>max<\/td><td>173.5745<\/td><\/tr><\/tbody><\/table><figcaption>Data statistics.<\/figcaption><\/figure>\n\n\n\n<p>We have 4960 records. The third quartile is about 6 (cases per 100,000 people per year for a given area) and the maximum value is 173, so we may expect multiple values around zero and a long tail. It is an example of the real-world phenomenon with the<em> Poisson Distribution <\/em>of rare events occurring independently over a specific area and time. We can plot a histogram of values and check if our assumptions over data distribution are correct:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">plt.figure(figsize=(15, 5))\nplt.hist(data['reg.est'], bins=30, alpha=0.85)\nplt.title('Histogram of the relative risk of having a breast cancer in Northeastern U.S.', size=20, pad=5)\nplt.xlabel('Relative Incidence Rate - Breast Cancer', size=12)\nplt.show()<\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"392\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_hist-1024x392.jpg\" alt=\"Histogram of cancer rates over a study extent. It has many values close to zero and a long tail with a few larger values.\" class=\"wp-image-616\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_hist-1024x392.jpg 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_hist-300x115.jpg 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_hist-768x294.jpg 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_hist-1536x587.jpg 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_hist.jpg 1841w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure><\/div>\n\n\n\n<p>As we see, most values oscillate around zero. It means that cancer is a relatively rare event &#8211; good for us. From a technical point of view, it is not a simple task to present values dispersed over this range. We will come back to this problem later.<\/p>\n\n\n\n<p>We know the general distribution of cancer rates; now, we will uncover the spatial distribution of those. <em>GeoPandas <\/em>has <code>.plot()<\/code> method to show spatial data. We style it slightly by removing margins and ticks and take a look into data:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"289\" height=\"300\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150-289x300.png\" alt=\"Breast Cancer Risk Map generated with GeoPandas\" class=\"wp-image-611\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150-289x300.png 289w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150-985x1024.png 985w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150-768x799.png 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/initial_map_150.png 1129w\" sizes=\"auto, (max-width: 289px) 100vw, 289px\" \/><figcaption>Breast Cancer Risk Map generated with GeoPandas<\/figcaption><\/figure><\/div>\n\n\n\n<p>This figure may be informative, but it is ugly and rather hard to read for non-specialists. It is especially hard to find any patterns and hot spots in it. A few things could be done better, and we will make it better with <em>pyGMT<\/em>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Map Development<\/h2>\n\n\n\n<p>Working with spatial data has its rhythm. We are always required to do something with a projection at some step, and this time we can\u2019t avoid it too. We need to know our data <em>Coordinate Reference System <\/em>to set map bounds for the mapping. The map canvas has its projection, and we must unify the data projection with the canvas projection.\u00a0\u00a0We will draw our map on <em>Mercator<\/em> projection, where coordinates are represented by degrees. Let\u2019s start with the analysis of the data projection. We can check its bounds with the <code>.total_bounds<\/code> parameter of the <code>GeoDataFrame<\/code> object:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">bds = data.total_bounds\nprint(bds)<\/pre>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[1277277.6705919   196124.50679998 2255886.37686832 1241124.50679814]<\/pre>\n\n\n\n<p>As we see, it\u2019s a metric representation. Therefore, we need to transform our projection to Mercator projection. To be sure that data has a different CRS, we can use the <code>.crs<\/code> parameter:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">print(data.crs)<\/pre>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">&lt;Derived Projected CRS: PROJCS[\"Lambert_Conformal_Conic\",GEOGCS[\"NAD83\",DA ...>\nName: Lambert_Conformal_Conic\nAxis Info [cartesian]:\n- [east]: Easting (metre)\n- [north]: Northing (metre)\nArea of Use:\n- undefined\nCoordinate Operation:\n- name: unnamed\n- method: Lambert Conic Conformal (2SP)\nDatum: North American Datum 1983\n- Ellipsoid: GRS 1980\n- Prime Meridian: Greenwich<\/pre>\n\n\n\n<p>Yes, it is a projection that doesn\u2019t fit our map. Lambert Conformal Conic projection is used to present maps of mid-latitudes of the United States, and it was used with this dataset to analyze variograms for the Kriging interpolation.<\/p>\n\n\n\n<p>Transformation of CRS in <em>GeoPandas<\/em> is very simple. We use method <code>.to_crs(crs)<\/code> for it:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">data2 = data.to_crs(crs='merc')\n\nbds = data2.total_bounds\n\nregion = [\n    bds[0] - 0.1,\n    bds[2] + 0.1,\n    bds[1] - 0.1,\n    bds[3] + 0.1,\n]\n\nprint(region)<\/pre>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">[-80.61423127996099, -66.88249508664613, 38.856652763089464, 47.516422721628444]<\/pre>\n\n\n\n<p>Now our extent is valid. We\u2019ve made it slightly larger to create a margin from the extreme points to the canvas frame. With an analysis region set, we can start mapping with <em>pyGMT<\/em>. First, we plot clean canvas:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">fig = pygmt.Figure()\nfig.basemap(region=region, projection=\"M15c\", frame=True)\nfig.coast(land=\"black\", water=\"skyblue\")\nfig.show()<\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"253\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/clean_canvas-300x253.jpg\" alt=\"The map canvas. It shows region of North-Eastern United States.\" class=\"wp-image-618\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/clean_canvas-300x253.jpg 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/clean_canvas-1024x864.jpg 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/clean_canvas-768x648.jpg 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/clean_canvas-1536x1296.jpg 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/clean_canvas.jpg 1911w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><figcaption>The map canvas.<\/figcaption><\/figure><\/div>\n\n\n\n<p>Even the clean canvas is pretty! What happened? In the first line, we\u2019ve set up a figure, then we\u2019ve drawn a canvas with the <code>fig.basemap<\/code> over a specific extent given by the <code>region<\/code> parameter and the projection. Parameter <code>frame<\/code> sets a pretty frame around a map. The method <code>fig.coast<\/code> draws land and water, and we can select the color of those features.\u00a0\u00a0When our baseline map is done, we can draw points on it:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">fig = pygmt.Figure()\nfig.basemap(region=region, projection=\"M15c\", frame=True)\nfig.coast(land=\"black\", water=\"skyblue\")\nfig.plot(x=data2.geometry.x, y=data2.geometry.y, style=\"c0.05c\", color=\"white\")\nfig.show()<\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"253\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_canvas-300x253.jpg\" alt=\"The map that shows North-Eastern part of the US and the sampling grid as a points on map.\" class=\"wp-image-619\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_canvas-300x253.jpg 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_canvas-1024x864.jpg 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_canvas-768x648.jpg 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_canvas-1536x1296.jpg 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_canvas.jpg 1911w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><figcaption>The canvas with the sampling points.<\/figcaption><\/figure><\/div>\n\n\n\n<p>We set a new layer on our map with points representing the Breast Cancer Incidence Rate sampling areas. Method <code>.plot()<\/code> takes multiple arguments:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><code>x<\/code> is a longitude,<\/li><li><code>y<\/code> is a latitude,<\/li><li><code>style<\/code> parameter was set to: <strong>c<\/strong>ircle of <strong>0.05 c<\/strong>entimeter size,<\/li><li><code>color<\/code> &#8211; points are white.<\/li><\/ul>\n\n\n\n<p>This figure is not very informative, and we haven\u2019t included any information about the Breast Cancer incidence rates. How to do it? We will work with a plot style. The <code>size<\/code> of circles should be dynamic and depend on the incidence rate; color should bring some information too. That\u2019s why we make a colormap from our data and plot a color bar.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># Set size and color of points\n\nds = data2['reg.est']\nfig = pygmt.Figure()\npygmt.makecpt(cmap=\"imola\", series=[ds.min(), ds.max()])\nfig.basemap(region=region, projection=\"M15c\", frame=True)\nfig.coast(land=\"black\", water=\"skyblue\")\nfig.plot(\n    x=data2.geometry.x,\n    y=data2.geometry.y,\n    size=ds \/ 200,\n    style=\"cc\",\n    color=ds,\n    cmap=True,\n    pen=\"black\",\n)\nfig.colorbar(frame='af+l\"Breast Cancer Risk Map\"')\nfig.show()<\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"291\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_representation-300x291.jpg\" alt=\"The Breast Cancer Risk Map with styled sampling points.\" class=\"wp-image-621\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_representation-300x291.jpg 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_representation-1024x992.jpg 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_representation-768x744.jpg 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_representation-1536x1489.jpg 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/point_representation.jpg 1911w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><figcaption>The Breast Cancer Risk Map with styled sampling points.<\/figcaption><\/figure><\/div>\n\n\n\n<p>As you see, we have included new methods and new parameters:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>We\u2019ve created a series of values <code>ds<\/code> from the <code>reg.est<\/code> column. It simplifies code in the following lines (we can always write <code>data2[\u2018reg.est\u2019]<\/code>, but it will be cumbersome).<\/li><li>In the next line, we create a figure and then a colormap to show our data with a different color in relation to the incidence rate value. We use the <strong>imola<\/strong> colormap. <strong>It is not a random color map<\/strong> &#8211; we have chosen it with a purpose. It\u2019s a scientific color map that fairly represents the data, is readable by color-vision and color-blind people and it\u2019s reproducible. More about it you can read here:\u00a0<a href=\"https:\/\/www.fabiocrameri.ch\/colourmaps\/\">https:\/\/www.fabiocrameri.ch\/colourmaps\/<\/a><\/li><li>We tune some parameters from the <code>plot<\/code> method. Now size and color of a circle are dynamic, and we set the <code>cmap<\/code> parameter to <code>True<\/code> to be sure that points are colored. Additionally, we have added a black border to each circle with the <code>pen<\/code> parameter.<\/li><li>The color bar is the last, new layer of our map. We added a title too.<\/li><\/ul>\n\n\n\n<p>This map could be the last. It is better than the map generated with <em>GeoPandas<\/em>. But we can still work on it, to make it prettier. First, we need to transform data and make it more uniform. We can do it with a logarithmic transform. Then we should take a look into a histogram of changed data.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">transformed = np.log(ds + 1)\n\nplt.figure(figsize=(15, 5))\nplt.hist(transformed, bins=30, alpha=0.85)\nplt.title('Histogram of the breast cancer incidence rate in Northeastern U.S.', size=20, pad=5)\nplt.xlabel('Incidence Rate - Breast Cancer', size=12)\nplt.show()<\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"394\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_1-1024x394.jpg\" alt=\"Transformed histogram of the breast cancer incidence rates.\" class=\"wp-image-622\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_1-1024x394.jpg 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_1-300x116.jpg 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_1-768x296.jpg 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_1-1536x592.jpg 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_1.jpg 1828w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Transformed histogram of the breast cancer incidence rates.<\/figcaption><\/figure><\/div>\n\n\n\n<p>This data has a lot of values that are very close to zero, and those are not so informative for us. That\u2019s why we will remove rates below 0.1 cases per 100\u00a0000 people per year from our data, and the histogram will be much more uniform. (<strong>Be careful! We can do it for this task; if we consider cancer we are interested in hot-spots and large values, but for other cases you should always consider if it is safe to remove data from the visualization!<\/strong>)<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># Remove zeros\n\ntransformed = np.log(ds + 1)\ntransformed[transformed &lt; 0.1] = np.nan\n\nplt.figure(figsize=(15, 5))\nplt.hist(transformed, bins=30, alpha=0.85)\nplt.title('Histogram of the breast cancer incidence rate in Northeastern U.S.', size=20, pad=5)\nplt.xlabel('Incidence Rate - Breast Cancer', size=12)\nplt.show()<\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"394\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_2-1024x394.jpg\" alt=\"Transformed histogram of the Breast Cancer Incidence Rates. Zeros and near-zero values are removed.\" class=\"wp-image-623\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_2-1024x394.jpg 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_2-300x116.jpg 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_2-768x296.jpg 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_2-1536x592.jpg 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/transformed_hist_2.jpg 1828w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>Transformed histogram of the Breast Cancer Incidence Rates. Zeros and near-zero values are removed.<\/figcaption><\/figure><\/div>\n\n\n\n<p>How does our data look now? Let\u2019s check the map!<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\">fig = pygmt.Figure()\npygmt.makecpt(cmap=\"imola\", series=[transformed.min(), transformed.max()])\nfig.basemap(region=region, projection=\"M15c\", frame=True)\nfig.coast(land=\"black\", water=\"skyblue\")\nfig.plot(\n    x=data2.geometry.x,\n    y=data2.geometry.y,\n    size=transformed \/ 20,\n    style=\"cc\",\n    color=transformed,\n    cmap=True,\n    pen=\"black\",\n)\nfig.colorbar(frame='af+l\"Breast Cancer Risk Map - natural log scale\"')\nfig.show()<\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"291\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_transform-300x291.jpg\" alt=\"The breast cancer incidence rates map after log transform of a data.\" class=\"wp-image-624\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_transform-300x291.jpg 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_transform-1024x992.jpg 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_transform-768x744.jpg 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_transform-1536x1489.jpg 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_transform.jpg 1911w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><figcaption>The breast cancer incidence rates map after log transform of a data.<\/figcaption><\/figure><\/div>\n\n\n\n<p>It is much cleaner! Now spatial patterns and clustering are more visible. But it could be better! We\u2019ve transformed input data, so now we will transform the canvas. We will change its colors and add country borders. Additionally, we will add some transparency to the plotted layer:<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># Work with the style\n\nfig = pygmt.Figure()\npygmt.makecpt(cmap=\"imola\", series=[transformed.min(), transformed.max()])\nfig.basemap(region=region, projection=\"M15c\", frame=True)\nfig.coast(land=\"#F5F5F5\", water=\"#82CAFA\", borders='1\/1p')\nfig.plot(\n    x=data2.geometry.x,\n    y=data2.geometry.y,\n    size=transformed \/ 20,\n    style=\"cc\",\n    color=transformed,\n    cmap=True,\n    pen=\"black\",\n    transparency=15\n)\nfig.colorbar(frame='af+l\"Breast Cancer Risk Map - natural log scale\"')\nfig.show()<\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-medium\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"291\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_styling-300x291.jpg\" alt=\"The breast cancer incidence rates map after log transform and styling.\" class=\"wp-image-625\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_styling-300x291.jpg 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_styling-1024x992.jpg 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_styling-768x744.jpg 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_styling-1536x1489.jpg 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/canvas_after_styling.jpg 1911w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><figcaption>The breast cancer incidence rates map after log transform and styling.<\/figcaption><\/figure><\/div>\n\n\n\n<p>We are close to the end! The last thing: we add text layers to map more informative. We add Canada and the Atlantic Ocean names to the map with the <code>text()<\/code> method and layer.<\/p>\n\n\n\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"python\" data-enlighter-theme=\"\" data-enlighter-highlight=\"\" data-enlighter-linenumbers=\"\" data-enlighter-lineoffset=\"\" data-enlighter-title=\"\" data-enlighter-group=\"\"># Add text\n\nfig = pygmt.Figure()\npygmt.makecpt(cmap=\"imola\", series=[transformed.min(), transformed.max()])\nfig.basemap(region=region, projection=\"M15c\", frame=True)\nfig.coast(land=\"#F5F5F5\", water=\"#82CAFA\", borders='1\/1p')\nfig.text(text=\"CANADA\", x=-76.5, y=46, font=\"18p,Helvetica-Bold\")\nfig.text(text=\"ATLANTIC OCEAN\", x=-70, y=40)\nfig.plot(\n    x=data2.geometry.x,\n    y=data2.geometry.y,\n    size=transformed \/ 20,\n    style=\"cc\",\n    color=transformed,\n    cmap=True,\n    pen=\"black\",\n    transparency=15\n)\nfig.colorbar(frame='af+l\"Breast Cancer Risk Map - natural log scale\"')\nfig.show()\n<\/pre>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"992\" src=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/output_fig-1024x992.png\" alt=\"The breast cancer incidence rate distribution map for the North-Eastern part of the U.S.\" class=\"wp-image-626\" srcset=\"https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/output_fig-1024x992.png 1024w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/output_fig-300x291.png 300w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/output_fig-768x744.png 768w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/output_fig-1536x1489.png 1536w, https:\/\/ml-gis-service.com\/wp-content\/uploads\/2022\/01\/output_fig.png 1911w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><figcaption>The breast cancer incidence rate distribution map for the North-Eastern part of the U.S.<\/figcaption><\/figure><\/div>\n\n\n\n<p>We\u2019ve made it! The final figure is much more readable, and at this stage, we could include it in the report or presentation. If you compare it with the initial map, you\u2019ll see how much we have gained due to the <em>pyGMT<\/em>. Isn\u2019t it great?<\/p>\n\n\n\n<p><strong>Happy Mapping in 2022!<\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Resources<\/h2>\n\n\n\n<ol class=\"wp-block-list\" type=\"1\"><li><strong>pyGMT<\/strong>:\u00a0<a href=\"https:\/\/www.pygmt.org\/latest\/index.html\">https:\/\/www.pygmt.org\/latest\/index.html<\/a><\/li><li><strong>Color Maps<\/strong>:\u00a0<a href=\"https:\/\/www.fabiocrameri.ch\/colourmaps\/\">https:\/\/www.fabiocrameri.ch\/colourmaps\/<\/a><\/li><li><strong>Repository<\/strong>:\u00a0<a href=\"https:\/\/github.com\/SimonMolinsky\/pygmt-article-2021-01\">https:\/\/github.com\/SimonMolinsky\/pygmt-article-2021-01<\/a><\/li><\/ol>\n","protected":false},"excerpt":{"rendered":"<p>How to create a beautiful map with Python and pyGMT<\/p>\n","protected":false},"author":1,"featured_media":609,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[138,18,137,3,49,30],"tags":[145,139,47,143,144,57,142,146,147,11,140,149,141,7,148,62],"class_list":["post-607","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cartography","category-data-science","category-data-visualization","category-python","category-rd","category-spatial-statistics","tag-beautiful-maps","tag-cartography","tag-data-science","tag-data-visualization","tag-dataviz","tag-geopandas","tag-gmt","tag-how-to-create-map","tag-map-for-report","tag-matplotlib","tag-plot-map","tag-publication","tag-pygmt","tag-python","tag-report","tag-spatial"],"_links":{"self":[{"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/posts\/607","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/comments?post=607"}],"version-history":[{"count":15,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/posts\/607\/revisions"}],"predecessor-version":[{"id":633,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/posts\/607\/revisions\/633"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/media\/609"}],"wp:attachment":[{"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/media?parent=607"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/categories?post=607"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ml-gis-service.com\/index.php\/wp-json\/wp\/v2\/tags?post=607"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}