rhkpy package


rhkpy.rhkpy_loader module


Load the data from the .sm4 file using the old loader from spym


Load the data from the .sm4 file using spym

class rhkpy.rhkpy_loader.rhkdata(filename, repetitions=0, alternate=True, loadraw=False, **kwargs)[source]

Bases: object

A container for the xarray based structure of the RHK data. Loads the RHK “sm4” file from the path at: filename.

  • filename (str) – path and filename of the “sm4” file to be loaded

  • repetitions (int, optional) – The number of repeated acquisitions of spectra at each tip position, defaults to 0

  • alternate (bool, optional) – True if the bias is swept forward and backward, False if not, defaults to True

  • loadraw (bool, optional) – Set to True if you want the raw topography data, defaults to False

Some variables of the rhkdata class:

  • filename – (type str) filename of the “sm4” file

  • image – (type xarray Dataset) Dataset containing the image data

  • spectra – (type xarray Dataset) Dataset containing the spectroscopy data

  • spymdata – (type spym instance) Dataset, as loaded by the spym module

All the variables can be listed by calling: rhkdata.print_info.


If you want to skip the “flatten” filter of the topography images, use: loadraw = True.

import rhkpy

# Load dI/dV spectra, measured along a line
filename = 'linespectra.sm4'
linespec = rhkpy.rhkdata(filename)

# display the contents of the spectroscopy `xarray` instance
print(linespec.spectra)
Dimensions:      (bias: 501, dist: 64, repetitions: 1, biasscandir: 2)
* bias         (bias) float64 0.5 0.498 0.496 0.494 ... -0.496 -0.498 -0.5
* dist         (dist) float64 0.0 0.5279 1.056 1.584 ... 32.2 32.73 33.26
* repetitions  (repetitions) int32 0
* biasscandir  (biasscandir) <U5 'left' 'right'
Data variables:
        lia          (bias, dist, repetitions, biasscandir) float64 3.585 ... 5.185
        current      (bias, dist, repetitions, biasscandir) float64 99.49 ... -132.7
        x            (dist) float64 -37.48 -36.97 -36.45 ... -5.902 -5.384 -4.866
        y            (dist) float64 -173.5 -173.6 -173.7 ... -179.7 -179.9 -180.0
Attributes: (12/15)
        filename:           line_9K_ABC6_2020_11_01_12_12_27_213.sm4
        bias:               0.49999988
        bias units:         V
        setpoint:           99.99999439624929
        setpoint units:     pA
        measurement date:   11/01/20
        ...                 ...
        LI amplitude unit:  mV
        LI frequency:       1300.0
        LI frequency unit:  Hz
        LI phase:           -102.9999998
        datatype:           line
        spectype:           iv

# select the dI/dV signal (lia) and average the
# forward and backward bias sweeps and repetitions
linespec_avg = linespec.spectra.lia.mean(dim = ['biasscandir', 'repetitions'])

# plot the dI/dV values along the remaining coordinates: bias, dist
linespec_avg.plot()

Returns a new rhkdata instance, with the coordinates updated to reflect the absolute tip position. This includes the X, Y offset and the rotation.


rhkdata instance, with the same data and metadata, but the rhkdata.image, xarray variable coordinates shifted to absolute tip positions.

Return type:

rhkdata instance


import rhkpy

m = rhkpy.rhkdata('didv map.sm4')

# Take the `rhkdata` instance (image or map): `m`,
# and convert the image coordinates to absolute values
m_abs = m.coord_to_absolute()

# coordinates of the instance `m`
# We can see it runs from 0 to 100 nm
print(m.image.x.min().data, m.image.x.max().data)
0.0 100.0

# check the same coordinate for the new `m_abs`
print(m_abs.image.x.min().data, m_abs.image.x.max().data)
-877.0008892433623 -741.0633876547834

# we can see it now shows the exact tip position
# the image is also rotated, as the "scan angle" attribute shows
m_abs.image.attrs['scan angle']

# plot the rotated and offset image
m_abs.image.topography.sel(scandir = 'forward').plot()
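
The offset and rotation mentioned above can be pictured as a plain 2D rotation followed by a translation. The sketch below is only an illustration under stated assumptions: the helper `to_absolute`, the counter-clockwise sign convention and the rotation about the origin of the relative frame are hypothetical and are not taken from the rhkpy source.

```python
import math

def to_absolute(x, y, x_offset, y_offset, angle_deg):
    """Rotate a relative (x, y) point by angle_deg (counter-clockwise,
    about the origin of the relative frame), then shift it by the offset.
    Hypothetical helper; rhkpy may use a different convention."""
    theta = math.radians(angle_deg)
    x_rot = x * math.cos(theta) - y * math.sin(theta)
    y_rot = x * math.sin(theta) + y * math.cos(theta)
    return x_rot + x_offset, y_rot + y_offset

# a point 100 nm along x, rotated by 90 degrees and shifted by (-800, -200)
x_abs, y_abs = to_absolute(100.0, 0.0, -800.0, -200.0, 90.0)
print(round(x_abs, 6), round(y_abs, 6))
```

For real data, the scan angle and offsets would come from the metadata of the measurement, and the sign convention should be checked against a known feature in the image.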

Returns a new instance of rhkdata, with the data averaged (using xarray.Variable.mean) along the ‘repetitions’ coordinate. Meant to be shorthand for: rhkdata_instance.spectra.mean(dim = 'repetitions').


rhkdata instance

Return type:



Returns a new instance of rhkdata, with the data averaged (using xarray.Variable.mean) along the ‘biasscandir’ or ‘zscandir’ coordinate. Meant to be shorthand for: rhkdata_instance.spectra.mean(dim = 'biasscandir'), or rhkdata_instance.spectra.mean(dim = 'zscandir').
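
What averaging along the ‘biasscandir’ coordinate amounts to can be shown without xarray. The sketch below uses plain Python lists with made-up numbers; `average_sweeps`, `forward` and `backward` are hypothetical names and not part of rhkpy.

```python
def average_sweeps(forward, backward):
    """Point-by-point mean of two sweeps sampled at the same bias values."""
    if len(forward) != len(backward):
        raise ValueError("sweeps must have the same number of points")
    return [(f + b) / 2 for f, b in zip(forward, backward)]

# made-up dI/dV values for the two bias sweep directions
forward = [1.0, 2.0, 4.0]
backward = [3.0, 2.0, 0.0]
averaged = average_sweeps(forward, backward)
print(averaged)
```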


rhkdata instance

Return type:



Uses polyflatten() to flatten the selected datafield in the rhkdata instance. All keywords accepted by polyflatten() can be passed.


rhkdata instance, with the selected field_type flattened. Default field_type = ‘topography’.

Return type:

rhkdata instance


List the variables of the rhkdata instance.

qplot(width=None, **kwargs)[source]

Quick plot of the rhkdata instance


width (float, optional) – set size of plot, defaults to None

The colorscales used for density plots can be specified by the keywords below. For possible colorscale options see the HoloViews colormaps.

  • cmap_topo (str, optional) – topography colorscale, defaults to ‘fire’

  • cmap_spec (str, optional) – colorscale for plotting dI/dV data, defaults to ‘viridis’


panel plot

Return type:


rhkpy.rhkpy_process module

rhkpy.rhkpy_process.bgsubtract(x_data, y_data, polyorder=1, toplot=False, fitmask=None, hmin=0.5, hmax=10000, wmin=1.5, wmax=20, prom=2, exclusion_factor=3, peak_pos=None)[source]

Takes the x_data and y_data and automatically finds peaks, using scipy.signal.find_peaks. These peaks are then used to define the areas of the background signal (y_data). In the areas with the peaks removed, the background is fitted by a polynomial of the order given by the optional argument: polyorder. The fit is performed by scipy.optimize.curve_fit. The function returns the y_data values with the background removed, the values of the background polynomial and the coefficients of the background fit, as used by numpy.polyval.

In cases where the automatic peak finding does not work as expected, one can pass the x_data values at which the peaks appear, using peak_pos. In this case, the wmin option determines the width of all peaks.

If a fitmask is supplied for fitting, the fitmask is not calculated and only a polynomial fit is performed. This can decrease the runtime.

  • x_data (numpy array) – variable of the data (typically bias voltage)

  • y_data (numpy array) – data values (typically dI/dV)

  • polyorder (int, optional) – order of polynomial used to fit the background, defaults to 1

  • toplot (bool, optional) – if True a plot of: the fit, the background used and positions of the peaks is shown, defaults to False

  • fitmask (numpy array, optional) – fitmask to be used for polynomial fitting, defaults to None

  • hmin (float, optional) – minimum height of the peaks passed to scipy.signal.find_peaks, defaults to 0.5

  • hmax (float, optional) – maximum height of the peaks passed to scipy.signal.find_peaks, defaults to 10000

  • wmin (float, optional) – minimum width of the peaks, passed to scipy.signal.find_peaks, defaults to 1.5

  • wmax (float, optional) – maximum width of the peaks passed to scipy.signal.find_peaks, defaults to 20

  • prom (float, optional) – prominence of the peaks, passed to scipy.signal.find_peaks, defaults to 2

  • exclusion_factor (float, optional) – this parameter multiplies the width of the peaks found by scipy.signal.find_peaks, or specified by wmin if the peak positions are passed by hand, using peak_pos, defaults to 3

  • peak_pos (list of floats, optional) – list of the peak positions in x_data values used for exclusion, defaults to None


y_data_nobg, bg_values, coeff, params_used_at_run, mask, covar

Return type:

tuple: (numpy array, numpy array, numpy array, dictionary, numpy array, numpy array)

  • y_data_nobg: data values, with the background subtracted,

  • bg_values: the polynomial values of the fit, at the x_data positions,

  • coeff: coefficients of the polynomial fit, as used by: numpy.polyval,

  • params_used_at_run: parameters used at runtime

  • mask: the calculated fitmask

  • covar: covariance of the fit parameters


Using the option: peak_pos, a wmin*exclusion_factor/2 region (measured in datapoints) on both sides of the peaks is excluded from the background fit. If automatic peak finding is used, the exclusion area is calculated in a similar way, but the width of each individual peak is used, as determined by scipy.signal.find_peaks.
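
The exclusion rule described above can be sketched in a few lines. This is a minimal illustration, not the rhkpy implementation: `exclusion_mask` is a hypothetical helper, the peaks are given as data point indices rather than x_data values, and treating the boundary as inclusive is an assumption.

```python
def exclusion_mask(n_points, peak_indices, wmin=1.5, exclusion_factor=3):
    """Return a list of booleans, True where a point may be used for the
    background fit. Points within wmin * exclusion_factor / 2 datapoints
    of a peak index are excluded. Hypothetical helper, not rhkpy code."""
    half_width = wmin * exclusion_factor / 2
    mask = [True] * n_points
    for i in range(n_points):
        if any(abs(i - p) <= half_width for p in peak_indices):
            mask[i] = False
    return mask

# 12 data points with a single peak at index 5
mask = exclusion_mask(12, [5])
excluded = [i for i, keep in enumerate(mask) if not keep]
print(excluded)
```

With the default values from the signature, the half width is 2.25 data points, so two neighbours on each side of the peak are left out of the fit.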


Takes as input the rhkdata.image variable of an rhkdata instance. Returns a new xarray instance, with the coordinates updated to reflect the absolute tip position. This includes the X, Y offset and the rotation.


xrobj (xarray Dataset) – xarray image variable of an rhkdata object


xarray rhkdata.image instance, with the same data and metadata as the input and the coordinates shifted to absolute tip positions.

Return type:

xarray Dataset

import rhkpy

m = rhkpy.rhkdata('didv map.sm4')

# Take the `rhkdata` instance (image or map): `m`,
# and convert the image coordinates to absolute values
m_abs_image = rhkpy.coord_to_absolute(m.image)

# coordinates of the instance `m`
# We can see it runs from 0 to 100 nm
print(m.image.x.min().data, m.image.x.max().data)
0.0 100.0

# check the same coordinate for the new `m_abs_image`
print(m_abs_image.x.min().data, m_abs_image.x.max().data)
-877.0008892433623 -741.0633876547834

# we can see it now shows the exact tip position
# the image is also rotated, as the "scan angle" attribute shows
m_abs_image.attrs['scan angle']

# plot the rotated and offset image
m_abs_image.topography.sel(scandir = 'forward').plot()

rhkpy.rhkpy_process.gaussian(x, x0=0, ampl=2, width=0.05, offset=0)[source]

Gaussian function. Width and amplitude parameters have the same meaning as for lorentz().

  • x (float, numpy array, etc.) – values for the x coordinate

  • x0 (float) – shift along the x coordinate

  • ampl (float) – amplitude of the peak

  • width (float) – FWHM of the peak

  • offset (float) – offset along the function value


values of a Gaussian function

Return type:

float, numpy array, etc.
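
Since width is documented as the FWHM, one common way to write such a Gaussian is shown below. The exact expression used inside rhkpy is not given here, so the formula and the name `gaussian_fwhm` are assumptions made for illustration.

```python
import math

def gaussian_fwhm(x, x0=0.0, ampl=2.0, width=0.05, offset=0.0):
    """Gaussian whose `width` is the full width at half maximum (FWHM).
    Assumed form, written for a single float x."""
    return offset + ampl * math.exp(-4 * math.log(2) * (x - x0) ** 2 / width ** 2)

peak = gaussian_fwhm(0.0)        # value at the peak position
half = gaussian_fwhm(0.05 / 2)   # value half a FWHM away from the peak
print(peak, half)
```

In this parametrization the function falls to half of ampl at a distance of width/2 from x0, which is what FWHM means.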

rhkpy.rhkpy_process.gaussian2(x, x01=-5, ampl1=1, width1=0.05, x02=5, ampl2=1, width2=0.05, offset=0)[source]

Double Gaussian function

  • x (float, numpy array, etc.) – values for the x coordinate

  • x01 (float, optional) – position of the first peak, defaults to -5

  • ampl1 (float, optional) – amplitude of the first peak, defaults to 1

  • width1 (float, optional) – width of the first peak, defaults to 0.05

  • x02 (float, optional) – position of the second peak, defaults to 5

  • ampl2 (float, optional) – amplitude of the second peak, defaults to 1

  • width2 (float, optional) – width of the second peak, defaults to 0.05

  • offset (float, optional) – offset, defaults to 0


values of a double Gaussian function

Return type:

float, numpy array, etc.

rhkpy.rhkpy_process.genthumbs(folderpath='./', **kwargs)[source]

Generate thumbnails for the sm4 files present in the current folder (usually the folder containing the Jupyter notebook). If folderpath is specified, it generates the thumbnails in the path given. All other files are ignored. Subfolders are ignored. The method uses qplot() to make the png images.


folderpath (str, optional) – path to the folder containing the sm4 files, defaults to ‘./’

import rhkpy

# generate thumbnails of the sm4 files in the current working directory
rhkpy.genthumbs()

# generate thumbnails for the folder "stm measurements/maps"
rhkpy.genthumbs(folderpath = './stm measurements/maps/')


Possible options for folderpath are:

  • relative path: “./” means the current directory. “../” is one directory above the current one.

  • absolute path: Can start with: “c:/users/averagejoe/data”

If you use backslashes to separate folder names, remember to prefix the path with “r”, which makes it a raw string, so that the backslashes are not treated as escape characters. For example: folderpath = r"c:\users\averagejoe\data". Paths can be copied directly from Windows Explorer, if you prefix them with “r”.
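
The effect of the “r” prefix can be checked with the standard library alone. The snippet below is a small illustration and does not call rhkpy; the path is the example path from the text.

```python
from pathlib import PureWindowsPath

raw = r"c:\users\averagejoe\data"          # raw string: backslashes kept as typed
escaped = "c:\\users\\averagejoe\\data"    # same path with doubled backslashes
forward = PureWindowsPath(raw).as_posix()  # converted to forward slashes

print(raw == escaped)
print(forward)
```

Without the prefix, this particular path does not even parse, because `\u` starts a Unicode escape in a normal Python string.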

rhkpy.rhkpy_process.lorentz(x, x0=0, ampl=2, width=0.05, offset=0)[source]

Single Lorentz function

  • x (float, numpy array, etc.) – values for the x coordinate

  • x0 (float) – position of the peak along the x coordinate

  • ampl (float) – amplitude of the peak

  • width (float) – FWHM of the peak

  • offset (float) – offset along the function value


values of a single Lorentz function

Return type:

float, numpy array, etc.


The area of the peak can be given by:

area = np.pi * amplitude * width / 2
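
The area formula can be checked numerically. The sketch below assumes the usual Lorentzian form with peak height ampl and FWHM width; `lorentz_fwhm` is a hypothetical stand-in and the expression is not copied from rhkpy.

```python
import math

def lorentz_fwhm(x, x0=0.0, ampl=2.0, width=0.05, offset=0.0):
    """Lorentzian with peak height `ampl` and FWHM `width` (assumed form)."""
    hwhm = width / 2
    return offset + ampl * hwhm ** 2 / ((x - x0) ** 2 + hwhm ** 2)

ampl, width = 2.0, 0.05
analytic_area = math.pi * ampl * width / 2

# trapezoidal integration over [-50, 50] with a step of 0.001
step, n_steps = 0.001, 100_000
xs = [-50.0 + i * step for i in range(n_steps + 1)]
ys = [lorentz_fwhm(x, ampl=ampl, width=width) for x in xs]
numeric_area = step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
print(analytic_area, numeric_area)
```

The numeric value comes out slightly smaller than the analytic one, because the slowly decaying tails outside the integration window are cut off.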

rhkpy.rhkpy_process.mapsection(specmap, start_point, end_point)[source]

Makes a section across a dI/dV spectroscopy map, specmap, along the line from start_point to end_point. It uses xarray.Dataset.interp to interpolate between data values.

  • specmap (xarray Dataset) – the spectra xarray variable of an rhkdata instance. Found under: rhkpy.rhkpy_loader.rhkdata.spectra.

  • start_point (tuple: (float, float)) – starting point for the line section. In the format: (x, y), found in the specpos_x, specpos_y coordinates of specmap.

  • end_point (tuple: (float, float)) – end point for the line section. In the format: (x, y), found in the specpos_x, specpos_y coordinates of specmap.


xarray Dataset of the line section

Return type:

xarray Dataset
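
A line section needs a set of sampling positions between the two end points, onto which the map is then interpolated. The sketch below only generates such positions; `section_points` and its `n_points` argument are hypothetical, and the interpolation done by xarray.Dataset.interp is not reproduced.

```python
def section_points(start_point, end_point, n_points):
    """Return n_points (x, y) tuples evenly spaced from start to end."""
    if n_points < 2:
        raise ValueError("need at least two points")
    (x0, y0), (x1, y1) = start_point, end_point
    points = []
    for i in range(n_points):
        t = i / (n_points - 1)  # runs from 0 at the start to 1 at the end
        points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points

points = section_points((0.0, 0.0), (10.0, 5.0), 5)
print(points)
```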

rhkpy.rhkpy_process.navigation(*args, **kwargs)[source]

Takes any number of rhkdata arguments: ‘map’, ‘line’, ‘spec’ and plots all of them on a single plot. Plotting of the spectroscopy positions can be skipped by setting the optional keyword: plot_spec to False. Plotting is done in the order of passing of the arguments. First argument will be plotted first.

The color map used for plotting topography images can be specified by the cmap optional keyword argument. Default value is ‘bone’.

Colors for use in the labels can be specified by the optional keyword: palette_name. If this is used, the number of colors also needs to be specified, by: num_colors. For possible palette options look at the bokeh palettes.


holoviews plot

Return type:


import rhkpy

# Load some data
didvmap = rhkpy.rhkdata('didvmap path/map.sm4')
topography = rhkpy.rhkdata('topo path/topo1.sm4')
single_spec = rhkpy.rhkdata('single spec path/single spec.sm4')

# plot the topography and spectroscopy positions
rhkpy.navigation(topography, didvmap, single_spec)

# skip plotting the spectroscopy positions
rhkpy.navigation(topography, didvmap, single_spec, plot_spec = False)

# In the above examples, the image from topography
# is plotted before the image data of didvmap!
# You can change the plotting order by changing
# the order of the `rhkdata` instances in the arguments.
rhkpy.navigation(didvmap, topography, single_spec) # now didvmap is plotted first


Arguments are plotted in the order they are passed to navigation().

The spectroscopy positions in dI/dV maps can be simply visualized by just passing the map rhkdata instance.

rhkpy.rhkpy_process.peakfit(xrobj, func=<function gaussian>, fitresult=None, stval=None, bounds=None, toplot=False, pos_x=None, pos_y=None, **kwargs)[source]

Fits a function to peaks in the data contained in the xrobj DataArray. Currently, Datasets with multiple DataArrays are not supported. Peak fitting is always assumed to be along the bias coordinate.

  • xrobj (xarray) – xarray DataArray, of a single spectrum or a map.

  • func (function, optional) – function to be used for fitting, defaults to gaussian

  • fitresult (xarray Dataset, optional) – an xarray Dataset of a previous fit calculation, with matching dimensions. If this is passed to peakfit(), the fit calculation is skipped and the passed Dataset is used.

  • stval (dictionary of func parameters, optional) – starting values for the fit parameters of func. You are free to specify only some of the values, the rest will be filled by defaults. Defaults are given in the starting values for keyword arguments in func.

  • bounds (dictionary of func parameters, with tuples containing lower and upper bounds, optional) – bounds for the fit parameters, used by xarray.DataArray.curvefit. A dictionary similar to stval, but each value is a pair of lower and upper bounds. Defaults to None

  • toplot (boolean, optional) – plot the fit result, defaults to False

  • pos_x (int or float, optional) – pos_x parameter of an xarray map to be used in conjunction with toplot = True

  • pos_y (int or float, optional) – pos_y parameter of an xarray map to be used in conjunction with toplot = True


fitted parameters of func and covariances in a Dataset

Return type:

xarray Dataset


import rhkpy

# example coming soon


  • Use toplot = True to tweak the starting values. If toplot = True, in case of a map, if no pos_x and pos_y are specified, the middle of the map is used for plotting.

  • Passing a bounds dictionary to peakfit() seems to increase the fitting time significantly. This might be an issue with xarray.DataArray.curvefit.

  • By passing a previous fit result, using the optional parameter fitresult, we can just plot the fit result at multiple regions of the map.

See also

It is good practice to crop the data to the vicinity of the peak you want to fit.
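
The way partially specified starting values can be completed from the keyword defaults of func may be sketched with the standard library. This is an illustration only: `fill_start_values` is a hypothetical helper, and the `gaussian` below is a stand-in that merely copies the keyword defaults from the documented signature.

```python
import inspect

def gaussian(x, x0=0, ampl=2, width=0.05, offset=0):
    """Stand-in with the same keyword defaults as the documented signature."""
    raise NotImplementedError("only the signature is used in this sketch")

def fill_start_values(func, stval=None):
    """Merge user-supplied starting values with func's keyword defaults."""
    defaults = {
        name: param.default
        for name, param in inspect.signature(func).parameters.items()
        if param.default is not inspect.Parameter.empty
    }
    unknown = set(stval or {}) - set(defaults)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return {**defaults, **(stval or {})}

start = fill_start_values(gaussian, {"x0": 0.3, "width": 0.1})
print(start)
```

Only x0 and width are supplied here; ampl and offset are taken from the defaults of the function.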

rhkpy.rhkpy_process.polyflatten(xrobj, field_type='topography', **kwargs)[source]

Fits a polynomial to the fast scan lines of topography data and subtracts it from the lines.

The keyword argument polyorder works the same way as in bgsubtract(). Keywords used by bgsubtract() can be passed.

Still needs testing.

  • xrobj (xarray Dataset, rhkdata.image) – xarray image variable of an rhkdata object

  • field_type (str, optional) – select the DataArray: ‘topography’, ‘current’ or ‘lia’, defaults to ‘topography’


New rhkdata.image Dataset of rhkdata, with the DataArray specified by field_type flattened.

Return type:

xarray Dataset
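
Line-by-line flattening can be illustrated for the simplest case, a straight line subtracted from each scan line. The sketch below assumes polyorder = 1, treats each row of a nested list as one fast scan line and uses a closed-form least-squares fit; `fit_line` and `flatten_lines` are hypothetical helpers, not the rhkpy implementation.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def flatten_lines(image, xs):
    """Subtract a fitted straight line from every row (scan line) of image."""
    flattened = []
    for row in image:
        slope, intercept = fit_line(xs, row)
        flattened.append([z - (slope * x + intercept) for x, z in zip(xs, row)])
    return flattened

xs = [0.0, 1.0, 2.0, 3.0]
# two scan lines with different tilt and height, plus a small bump in the first
image = [[1.0, 2.0, 3.5, 4.0], [5.0, 4.0, 3.0, 2.0]]
flat = flatten_lines(image, xs)
print(flat)
```

The second row is a pure tilt and flattens to zero, while the bump in the first row survives as the residual of the fit.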

rhkpy.rhkpy_process.polynomial_fit(order, x_data, y_data)[source]

Polynomial fit to x_data, y_data

  • order (int) – order of the polynomial to be fit

  • x_data (numpy array) – x coordinate of the data

  • y_data (numpy array) – y coordinate of the data


coefficients of the polynomial coeff, as used by numpy.polyval, and the covariance matrix covar, as returned by scipy.optimize.curve_fit

Return type:

tuple: (numpy array, numpy array)
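
Coefficients "as used by numpy.polyval" are ordered from the highest power down to the constant term. The pure Python sketch below evaluates a polynomial with that ordering; `horner_polyval` is a hypothetical name used for illustration and the coefficients are made up.

```python
def horner_polyval(coeff, x):
    """Evaluate a polynomial with Horner's method.
    Coefficients are ordered from the highest power to the constant term,
    which is the ordering numpy.polyval expects."""
    result = 0.0
    for c in coeff:
        result = result * x + c
    return result

coeff = [2.0, -1.0, 0.5]            # 2*x**2 - x + 0.5
value = horner_polyval(coeff, 3.0)  # 2*9 - 3 + 0.5
print(value)
```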

Module contents