Class category

Class Documentation

class category

The class category is a sequence container for rows of data values. You can think of it as a std::vector<cif::row_handle>-like class.

A category_validator can be assigned to a category object, after which the category can validate the data it contains and use an index to keep key values unique.

Public Types

using key_type = row_initializer

The key type.

Public Functions

category() = default

Default constructor.

category(std::string_view name)

Constructor taking a name.

category(const category &rhs)

Copy constructor.

category(category &&rhs)

Move constructor.

category &operator=(const category &rhs)

Copy assignment operator.

category &operator=(category &&rhs)

Move assignment operator.

~category()

Destructor.

Note

Please note that the destructor is not virtual. It is assumed that you will not derive from this class.

inline const std::string &name() const

Returns the name of the category.

iset key_fields() const

Returns the cif::iset of key field names. Retrieved from the category_validator for this category.

std::set<uint16_t> key_field_indices() const

Returns a set of indices for the key fields.

void set_validator(const validator *v, datablock &db)

Set the validator for this category to v.

Parameters
  • v – The validator to assign

  • db – The enclosing datablock

void update_links(datablock &db)

Update the links in this category.

Parameters

db – The enclosing datablock

inline const validator *get_validator() const

Return the global validator for the data.

Returns

The validator or nullptr if not assigned

inline const category_validator *get_cat_validator() const

Return the category validator for this category.

Returns

The category_validator or nullptr if not assigned

bool is_valid() const

Validate the data stored using the assigned category_validator.

Returns

True if all validations pass

bool validate_links() const

Validate the links: values in this category should have a matching value in the parent categories.

Note

The code makes one exception when validating missing links and that’s between atom_site and a parent pdbx_poly_seq_scheme or entity_poly_seq. This particular case should be skipped because it is wrong: there are atoms that are not part of a polymer, and thus will have no parent in those categories.

Returns

True if all validations pass

bool operator==(const category &rhs) const

Equality operator, returns true if rhs is equal to this.

Parameters

rhs – The object to compare with

Returns

True if the data contained is equal

inline bool operator!=(const category &rhs) const

Inequality operator, returns true if rhs is not equal to this.

Parameters

rhs – The object to compare with

Returns

True if the data contained is not equal

inline reference front()

Return a reference to the first row in this category.

Returns

Reference to the first row in this category. The result is undefined if the category is empty.

inline const_reference front() const

Return a const reference to the first row in this category.

Returns

const reference to the first row in this category. The result is undefined if the category is empty.

inline reference back()

Return a reference to the last row in this category.

Returns

Reference to the last row in this category. The result is undefined if the category is empty.

inline const_reference back() const

Return a const reference to the last row in this category.

Returns

const reference to the last row in this category. The result is undefined if the category is empty.

inline iterator begin()

Return an iterator to the first row.

inline iterator end()

Return an iterator pointing past the last row.

inline const_iterator begin() const

Return a const iterator to the first row.

inline const_iterator end() const

Return a const iterator pointing past the last row.

inline const_iterator cbegin() const

Return a const iterator to the first row.

inline const_iterator cend() const

Return a const iterator pointing past the last row.

inline size_t size() const

Return a count of the rows in this container.

inline size_t max_size() const

Return the theoretical maximum number of rows that can be stored.

inline bool empty() const

Return true if the category is empty.

row_handle operator[](const key_type &key)

Return a row_handle for the row specified by key.

Parameters

key – The value for the key; all key fields specified in the dictionary should have a value

Returns

The row found in the index, or an undefined row_handle

inline const row_handle operator[](const key_type &key) const

Return a const row_handle for the row specified by key.

Parameters

key – The value for the key; all key fields specified in the dictionary should have a value

Returns

The row found in the index, or an undefined row_handle

template<typename ...Ts, typename ...Ns>
inline iterator_proxy<const category, Ts...> rows(Ns... names) const

Return a special const iterator for all rows in this category. This iterator can be used in a structured binding context. E.g.:

for (const auto &[name, value] : cat.rows<std::string,int>("item_name", "item_value"))
  std::cout << name << ": " << value << '\n';

Template Parameters

Ts – The types for the columns requested

Parameters

names – The names for the columns requested

template<typename ...Ts, typename ...Ns>
inline iterator_proxy<category, Ts...> rows(Ns... names)

Return a special iterator for all rows in this category. This iterator can be used in a structured binding context. E.g.:

for (const auto &[name, value] : cat.rows<std::string,int>("item_name", "item_value"))
  std::cout << name << ": " << value << '\n';

// or in case we only need one column:

for (int id : cat.rows<int>("id"))
  std::cout << id << '\n';

Template Parameters

Ts – The types for the columns requested

Parameters

names – The names for the columns requested
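
To make the structured binding pattern above concrete without depending on the library, here is a minimal standalone sketch. It assumes a hypothetical row model (a map from column name to text) and a hypothetical helper project that turns each row into a tuple of the requested types; neither name is part of the documented API.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <tuple>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a row: column name -> textual value.
using Row = std::map<std::string, std::string>;

// Convert the text stored in a cell to the requested type.
template <typename T>
T convert(const std::string &text)
{
    if constexpr (std::is_same_v<T, std::string>)
        return text;
    else
    {
        T value{};
        std::istringstream in(text);
        in >> value;
        return value;
    }
}

// Project every row onto the named columns, producing one tuple per row.
// A missing column makes Row::at throw std::out_of_range.
template <typename... Ts, typename... Ns>
std::vector<std::tuple<Ts...>> project(const std::vector<Row> &rows, Ns... names)
{
    static_assert(sizeof...(Ts) == sizeof...(Ns), "one column name per requested type");
    std::vector<std::tuple<Ts...>> result;
    for (const Row &row : rows)
        result.emplace_back(convert<Ts>(row.at(names))...);
    return result;
}
```

With this helper, a loop such as for (const auto &[name, value] : project<std::string, int>(rows, "item_name", "item_value")) binds each tuple element to its own variable, which is the same shape as the loops shown above.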

inline conditional_iterator_proxy<category> find(condition &&cond)

Return a special iterator to loop over all rows that conform to cond.

for (row_handle rh : cat.find(cif::key("first_name") == "John" and cif::key("last_name") == "Doe"))
{
  // do something with rh
}

Parameters

cond – The condition for the query

Returns

A special iterator that loops over all elements that match. The iterator can be dereferenced to a row_handle
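
The query expression above combines comparisons on named columns into a single condition. The following standalone sketch shows one way such composable conditions can be modelled. Row, Condition, Key and find_rows are hypothetical names chosen for illustration; the library's own condition type is not shown in this document and may work differently.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Row = std::map<std::string, std::string>;     // hypothetical row model
using Condition = std::function<bool(const Row &)>; // hypothetical condition model

// A named column that can be compared with a value to form a Condition.
struct Key
{
    std::string name;
};

// True for rows that have the column and store exactly this value in it.
Condition operator==(const Key &k, const std::string &value)
{
    return [name = k.name, value](const Row &row) {
        auto it = row.find(name);
        return it != row.end() && it->second == value;
    };
}

// Combine two conditions; a row matches only if both hold.
Condition operator&&(Condition a, Condition b)
{
    return [a = std::move(a), b = std::move(b)](const Row &row) { return a(row) && b(row); };
}

// Collect pointers to all rows that satisfy cond, in storage order.
std::vector<const Row *> find_rows(const std::vector<Row> &rows, const Condition &cond)
{
    std::vector<const Row *> result;
    for (const Row &row : rows)
        if (cond(row))
            result.push_back(&row);
    return result;
}
```

A call such as find_rows(rows, Key{"first_name"} == "John" && Key{"last_name"} == "Doe") then reads much like the query above. Note that an overloaded && evaluates both operands eagerly, which is harmless here because the operands only build predicates.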

inline conditional_iterator_proxy<category> find(iterator pos, condition &&cond)

Return a special iterator to loop over all rows that conform to cond starting at pos.

Parameters
  • pos – Where to start searching

  • cond – The condition for the query

Returns

A special iterator that loops over all elements that match. The iterator can be dereferenced to a row_handle

inline conditional_iterator_proxy<const category> find(condition &&cond) const

Return a special const iterator to loop over all rows that conform to cond.

Parameters

cond – The condition for the query

Returns

A special iterator that loops over all elements that match. The iterator can be dereferenced to a const row_handle

inline conditional_iterator_proxy<const category> find(const_iterator pos, condition &&cond) const

Return a special const iterator to loop over all rows that conform to cond starting at pos.

Parameters
  • pos – Where to start searching

  • cond – The condition for the query

Returns

A special iterator that loops over all elements that match. The iterator can be dereferenced to a const row_handle

template<typename ...Ts, typename ...Ns>
inline conditional_iterator_proxy<category, Ts...> find(condition &&cond, Ns... names)

Return a special iterator to loop over all rows that conform to cond. The resulting iterator can be used in a structured binding context.

for (const auto &[name, value] : cat.find<std::string,int>(cif::key("item_value") > 10, "item_name", "item_value"))
   std::cout << name << ": " << value << '\n';

Parameters
  • cond – The condition for the query

  • names – The names for the columns requested

Template Parameters

Ts – The types for the columns requested

Returns

A special iterator that loops over all elements that match.

template<typename ...Ts, typename ...Ns>
inline conditional_iterator_proxy<const category, Ts...> find(condition &&cond, Ns... names) const

Return a special const iterator to loop over all rows that conform to cond. The resulting iterator can be used in a structured binding context.

Parameters
  • cond – The condition for the query

  • names – The names for the columns requested

Template Parameters

Ts – The types for the columns requested

Returns

A special iterator that loops over all elements that match.

template<typename ...Ts, typename ...Ns>
inline conditional_iterator_proxy<category, Ts...> find(const_iterator pos, condition &&cond, Ns... names)

Return a special iterator to loop over all rows that conform to cond starting at pos. The resulting iterator can be used in a structured binding context.

Parameters
  • pos – Iterator pointing to the location where to start

  • cond – The condition for the query

  • names – The names for the columns requested

Template Parameters

Ts – The types for the columns requested

Returns

A special iterator that loops over all elements that match.

template<typename ...Ts, typename ...Ns>
inline conditional_iterator_proxy<const category, Ts...> find(const_iterator pos, condition &&cond, Ns... names) const

Return a special const iterator to loop over all rows that conform to cond starting at pos. The resulting iterator can be used in a structured binding context.

Parameters
  • pos – Iterator pointing to the location where to start

  • cond – The condition for the query

  • names – The names for the columns requested

Template Parameters

Ts – The types for the columns requested

Returns

A special iterator that loops over all elements that match.

inline row_handle find1(condition &&cond)

Return the row handle for the row that matches cond. Throws multiple_results_error if there is not exactly one row matching cond.

Parameters

cond – The condition to search for

Returns

Row handle to the row found
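
The exactly-one-match contract described above can be sketched in standalone form as follows. The row and condition models and the function find_exactly_one are hypothetical, and the multiple_results_error defined here is only a stand-in for the library's exception of the same name.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Row = std::map<std::string, std::string>;     // hypothetical row model
using Condition = std::function<bool(const Row &)>; // hypothetical condition model

// Stand-in for the library's multiple_results_error.
struct multiple_results_error : std::runtime_error
{
    multiple_results_error()
        : std::runtime_error("query should have returned exactly one row")
    {
    }
};

// Return the single row matching cond; throw if zero or several rows match.
const Row &find_exactly_one(const std::vector<Row> &rows, const Condition &cond)
{
    const Row *found = nullptr;
    for (const Row &row : rows)
    {
        if (!cond(row))
            continue;
        if (found != nullptr) // a second match violates the contract
            throw multiple_results_error();
        found = &row;
    }
    if (found == nullptr) // no match violates it as well
        throw multiple_results_error();
    return *found;
}
```

Callers that expect a missing row as a normal outcome are probably better served by one of the find_first overloads, which do not throw when nothing matches.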

inline row_handle find1(iterator pos, condition &&cond)

Return the row handle for the row that matches cond, starting the search at pos. Throws multiple_results_error if there is not exactly one row matching cond.

Parameters
  • pos – The position to start the search

  • cond – The condition to search for

Returns

Row handle to the row found

inline const row_handle find1(condition &&cond) const

Return the const row handle for the row that matches cond. Throws multiple_results_error if there is not exactly one row matching cond.

Parameters

cond – The condition to search for

Returns

Row handle to the row found

inline const row_handle find1(const_iterator pos, condition &&cond) const

Return the const row handle for the row that matches cond, starting the search at pos. Throws multiple_results_error if there is not exactly one row matching cond.

Parameters
  • pos – The position to start the search

  • cond – The condition to search for

Returns

Row handle to the row found

template<typename T>
inline T find1(condition &&cond, const char *column) const

Return the value for the column named column for the single row that matches cond. Throws multiple_results_error if there is not exactly one row.

Template Parameters

T – The type to use for the result

Parameters
  • cond – The condition to search for

  • column – The name of the column to return the value for

Returns

The value found

template<typename T, std::enable_if_t<not is_optional_v<T>, int> = 0>
inline T find1(const_iterator pos, condition &&cond, const char *column) const

Return the value for the column named column for the single row that matches cond when starting to search at pos. Throws multiple_results_error if there is not exactly one row.

Template Parameters

T – The type to use for the result

Parameters
  • pos – The location to start the search

  • cond – The condition to search for

  • column – The name of the column to return the value for

Returns

The value found

template<typename T, std::enable_if_t<is_optional_v<T>, int> = 0>
inline T find1(const_iterator pos, condition &&cond, const char *column) const

Return a value of type std::optional<T> for the column named column for the single row that matches cond when starting to search at pos. If the row was not found, an empty value is returned.

Template Parameters

T – The type to use for the result

Parameters
  • pos – The location to start the search

  • cond – The condition to search for

  • column – The name of the column to return the value for

Returns

The value found, can be empty if no row matches the condition
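
The two overloads above differ only in whether T is a std::optional, and std::enable_if_t selects between them at compile time. Below is a minimal standalone sketch of that dispatch. The trait is written out here because its definition is not part of this document, lookup is a hypothetical function, and the sketch handles text cells only and ignores the multiple-match case.

```cpp
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <vector>

using Row = std::map<std::string, std::string>; // hypothetical row model

// Trait detecting std::optional, modelled on the is_optional_v used in the signatures above.
template <typename T> struct is_optional : std::false_type {};
template <typename T> struct is_optional<std::optional<T>> : std::true_type {};
template <typename T> inline constexpr bool is_optional_v = is_optional<T>::value;

// Plain flavour: chosen when T is not an optional; a missing row is an error.
template <typename T, std::enable_if_t<!is_optional_v<T>, int> = 0>
T lookup(const std::vector<Row> &rows, const std::string &id, const char *column)
{
    for (const Row &row : rows)
        if (row.at("id") == id)
            return T(row.at(column));
    throw std::out_of_range("no row with the requested id");
}

// Optional flavour: chosen when T is a std::optional; a missing row gives an empty result.
template <typename T, std::enable_if_t<is_optional_v<T>, int> = 0>
T lookup(const std::vector<Row> &rows, const std::string &id, const char *column)
{
    for (const Row &row : rows)
        if (row.at("id") == id)
            return T(row.at(column));
    return T{};
}
```

Requesting lookup<std::optional<std::string>>(...) lets the caller test the result with has_value() instead of catching an exception.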

template<typename ...Ts, typename ...Cs, typename U = std::enable_if_t<sizeof...(Ts) != 1>>
inline std::tuple<Ts...> find1(condition &&cond, Cs... columns) const

Return a std::tuple of the values for the columns named in columns for the single row that matches cond. Throws multiple_results_error if there is not exactly one row.

Template Parameters

Ts – The types to use for the resulting tuple

Parameters
  • cond – The condition to search for

  • columns – The names of the columns to return the value for

Returns

The values found as a single tuple of type std::tuple<Ts...>

template<typename ...Ts, typename ...Cs, typename U = std::enable_if_t<sizeof...(Ts) != 1>>
inline std::tuple<Ts...> find1(const_iterator pos, condition &&cond, Cs... columns) const

Return a std::tuple of the values for the columns named in columns for the single row that matches cond when starting to search at pos. Throws multiple_results_error if there is not exactly one row.

Template Parameters

Ts – The types to use for the resulting tuple

Parameters
  • pos – The location to start the search

  • cond – The condition to search for

  • columns – The names of the columns to return the value for

Returns

The values found as a single tuple of type std::tuple<Ts...>

inline row_handle find_first(condition &&cond)

Return a row handle to the first row that matches cond.

Parameters

cond – The condition to search for

Returns

The handle to the row that matches or an empty row_handle

inline row_handle find_first(iterator pos, condition &&cond)

Return a row handle to the first row that matches cond starting at pos.

Parameters
  • pos – The location to start searching

  • cond – The condition to search for

Returns

The handle to the row that matches or an empty row_handle

inline const row_handle find_first(condition &&cond) const

Return a const row handle to the first row that matches cond.

Parameters

cond – The condition to search for

Returns

The const handle to the row that matches or an empty row_handle

inline const row_handle find_first(const_iterator pos, condition &&cond) const

Return a const row handle to the first row that matches cond starting at pos.

Parameters
  • pos – The location to start searching

  • cond – The condition to search for

Returns

The const handle to the row that matches or an empty row_handle

template<typename T>
inline T find_first(condition &&cond, const char *column) const

Return the value for column column for the first row that matches condition cond.

Template Parameters

T – The type of the value to return

Parameters
  • cond – The condition to search for

  • column – The column for which the value should be returned

Returns

The value found or a default constructed value if not found
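
Returning a default constructed value when nothing matches has a consequence that is easy to overlook, illustrated by the standalone sketch below. The row model with integer cells and the function first_value are hypothetical stand-ins, not library code.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

using Row = std::map<std::string, int>;             // hypothetical row model with integer cells
using Condition = std::function<bool(const Row &)>; // hypothetical condition model

// Value of column in the first row matching cond, or a default constructed T if none matches.
template <typename T>
T first_value(const std::vector<Row> &rows, const Condition &cond, const char *column)
{
    for (const Row &row : rows)
        if (cond(row))
            return static_cast<T>(row.at(column));
    return T{};
}
```

In this sketch a result of 0 can mean either that the first matching row stores 0 or that no row matched. When that difference matters, checking exists(cond) first is one way to tell the two cases apart.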

template<typename T>
inline T find_first(const_iterator pos, condition &&cond, const char *column) const

Return the value for column column for the first row that matches condition cond when starting the search at pos.

Template Parameters

T – The type of the value to return

Parameters
  • pos – The location to start searching

  • cond – The condition to search for

  • column – The column for which the value should be returned

Returns

The value found or a default constructed value if not found

template<typename ...Ts, typename ...Cs, typename U = std::enable_if_t<sizeof...(Ts) != 1>>
inline std::tuple<Ts...> find_first(condition &&cond, Cs... columns) const

Return a tuple containing the values for the columns columns for the first row that matches condition cond.

Template Parameters

Ts – The types of the values to return

Parameters
  • cond – The condition to search for

  • columns – The columns for which the values should be returned

Returns

The values found or default constructed values if not found

template<typename ...Ts, typename ...Cs, typename U = std::enable_if_t<sizeof...(Ts) != 1>>
inline std::tuple<Ts...> find_first(const_iterator pos, condition &&cond, Cs... columns) const

Return a tuple containing the values for the columns columns for the first row that matches condition cond when starting the search at pos.

Template Parameters

Ts – The types of the values to return

Parameters
  • pos – The location to start searching

  • cond – The condition to search for

  • columns – The columns for which the values should be returned

Returns

The values found or default constructed values if not found

template<typename T, std::enable_if_t<std::is_arithmetic_v<T>, int> = 0>
inline T find_max(const char *column, condition &&cond) const

Return the maximum value for column column for all rows that match condition cond.

Template Parameters

T – The type of the value to return

Parameters
  • column – The column to use for the value

  • cond – The condition to search for

Returns

The value found or the minimal value for the type

template<typename T, std::enable_if_t<std::is_arithmetic_v<T>, int> = 0>
inline T find_max(const char *column) const

Return the maximum value for column column for all rows.

Template Parameters

T – The type of the value to return

Parameters

column – The column to use for the value

Returns

The value found or the minimal value for the type

template<typename T, std::enable_if_t<std::is_arithmetic_v<T>, int> = 0>
inline T find_min(const char *column, condition &&cond) const

Return the minimum value for column column for all rows that match condition cond.

Template Parameters

T – The type of the value to return

Parameters
  • column – The column to use for the value

  • cond – The condition to search for

Returns

The value found or the maximum value for the type

template<typename T, std::enable_if_t<std::is_arithmetic_v<T>, int> = 0>
inline T find_min(const char *column) const

Return the minimum value for column column for all rows.

Template Parameters

T – The type of the value to return

Parameters

column – The column to use for the value

Returns

The value found or the maximum value for the type
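
The fallback values described for find_max and find_min follow naturally from starting a running extreme at the opposite end of the type's range. The standalone sketch below works on a plain vector of values. The function names are hypothetical, and using std::numeric_limits<T>::lowest() as "the minimal value for the type" is an assumption; the library may pick a different constant for floating point types.

```cpp
#include <algorithm>
#include <limits>
#include <type_traits>
#include <vector>

// Largest element, or the lowest representable T when values is empty.
template <typename T, std::enable_if_t<std::is_arithmetic_v<T>, int> = 0>
T max_or_lowest(const std::vector<T> &values)
{
    T result = std::numeric_limits<T>::lowest();
    for (T v : values)
        result = std::max(result, v);
    return result;
}

// Smallest element, or the largest representable T when values is empty.
template <typename T, std::enable_if_t<std::is_arithmetic_v<T>, int> = 0>
T min_or_highest(const std::vector<T> &values)
{
    T result = std::numeric_limits<T>::max();
    for (T v : values)
        result = std::min(result, v);
    return result;
}
```

Because the fallback is an ordinary value of T, a caller cannot tell an empty selection from one that really contains that extreme. Testing with exists or count beforehand avoids the ambiguity.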

inline bool exists(condition &&cond) const

Return whether a row exists that matches condition cond.

Parameters

cond – The condition to match

Returns

True if a row exists

inline size_t count(condition &&cond) const

Return the total number of rows that match condition cond.

Parameters

cond – The condition to match

Returns

The count

bool has_children(row_handle r) const

Using the relations defined in the validator, return whether the row in r has any children in other categories

bool has_parents(row_handle r) const

Using the relations defined in the validator, return whether the row in r has any parents in other categories

std::vector<row_handle> get_children(row_handle r, const category &childCat) const

Using the relations defined in the validator, return the row handles for all rows in childCat that are linked to row r

std::vector<row_handle> get_parents(row_handle r, const category &parentCat) const

Using the relations defined in the validator, return the row handles for all rows in parentCat that are linked to row r

std::vector<row_handle> get_linked(row_handle r, const category &cat) const

Using the relations defined in the validator, return the row handles for all rows in cat that are in any way linked to row r

iterator erase(iterator pos)

Erase the row pointed to by pos and return the iterator to the row following pos.

inline void erase(row_handle rh)

Erase row rh.

size_t erase(condition &&cond)

Erase all rows that match condition cond.

Parameters

cond – The condition

Returns

The number of rows that have been erased

size_t erase(condition &&cond, std::function<void(row_handle)> &&visit)

Erase all rows that match condition cond calling the visitor function visit for each before actually erasing it.

Parameters
  • cond – The condition

  • visit – The visitor function

Returns

The number of rows that have been erased
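
The visitor overload calls visit on each matching row before that row is removed. A standalone sketch of the same pattern on a std::vector is given below; erase_visiting is a hypothetical name and the sketch says nothing about how the library stores its rows.

```cpp
#include <cstddef>
#include <vector>

// Erase every element for which cond returns true, calling visit on it just before removal.
// Returns the number of erased elements.
template <typename T, typename Cond, typename Visit>
std::size_t erase_visiting(std::vector<T> &items, Cond cond, Visit visit)
{
    std::size_t erased = 0;
    for (auto it = items.begin(); it != items.end();)
    {
        if (cond(*it))
        {
            visit(*it);           // the element is still valid at this point
            it = items.erase(it); // erase returns the iterator following the removed element
            ++erased;
        }
        else
            ++it;
    }
    return erased;
}
```

Running the visitor first gives the caller a last chance to log the row or copy values out of it, for instance to clean up related data elsewhere.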

inline iterator emplace(row_initializer &&ri)

Emplace the values in ri in a new row.

Parameters

ri – An object containing the values to insert

Returns

iterator to the newly created row

template<typename ItemIter>
inline iterator emplace(ItemIter b, ItemIter e)

Create a new row and emplace the values in the range b to e in it.

Parameters
  • b – Iterator to the beginning of the range of item_value

  • e – Iterator to the end of the range of item_value

Returns

iterator to the newly created row

void clear()

Completely erase all rows contained in this category.

std::string get_unique_id(std::function<std::string(int)> generator = cif::cif_id_for_number)

Generate a new, unique ID. Pass it an ID-generating function based on a sequence number. This function will be called until the result is unique in the context of this category.

inline std::string get_unique_id(const std::string &prefix)

Generate a new, unique ID based on a string prefix followed by a number.

Parameters

prefix – The string prefix

Returns

a new unique ID
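
The generate-until-unique loop described for get_unique_id can be sketched in standalone form as follows. The set of used IDs stands in for the category's existing values. The function names, the starting sequence number, and the decision to record the new ID in the set are all assumptions of this sketch.

```cpp
#include <functional>
#include <set>
#include <string>

// Call generator with 0, 1, 2, ... until it yields an ID that is not in used,
// then record that ID in used and return it.
std::string unique_id(std::set<std::string> &used, const std::function<std::string(int)> &generator)
{
    for (int n = 0;; ++n)
    {
        std::string id = generator(n);
        if (used.insert(id).second) // insert reports whether the ID was new
            return id;
    }
}

// Prefix flavour: candidate IDs are the prefix followed by 1, 2, 3, ...
std::string unique_id_with_prefix(std::set<std::string> &used, const std::string &prefix)
{
    return unique_id(used, [&prefix](int n) { return prefix + std::to_string(n + 1); });
}
```

The loop only ends if the generator eventually produces a fresh value, so a generator that ignores its argument would never return in this sketch.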

inline void update_value(condition &&cond, std::string_view tag, std::string_view value)

Set the column named tag to value in all rows that match cond, making sure the linked categories are updated according to the link. That is, child categories are updated if the links are absolute and unique. If they are not, the child category rows are split.

void update_value(const std::vector<row_handle> &rows, std::string_view tag, std::string_view value)

Set the column named tag to value in all rows in rows, making sure the linked categories are updated according to the link. That is, child categories are updated if the links are absolute and unique. If they are not, the child category rows are split.

inline uint16_t get_column_ix(std::string_view column_name) const

Return the index number for column_name.

inline std::string_view get_column_name(uint16_t ix) const

Return the name for column with index ix.

Parameters

ix – The index number

Returns

The name of the column

inline uint16_t add_column(std::string_view column_name)

Make sure a column with name column_name is known and return its index number.

Parameters

column_name – The name of the column

Returns

The index number of the column

inline bool has_column(std::string_view name) const

Return whether a column with name name exists in this category.

Parameters

name – The name of the column

Returns

True if the column exists

iset get_columns() const

Return the cif::iset of columns in this category.

void sort(std::function<int(row_handle, row_handle)> f)

Sort the rows using comparator function f.

Parameters

f – The comparator function, taking two row_handles and returning an int that indicates whether the first is smaller than, equal to, or larger than the second (a value <0, 0, or >0, respectively)
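
A comparator that returns an int differs from the bool predicate that the standard sorting algorithms expect. The standalone sketch below shows one way to adapt such a three-way comparator. sort_three_way and compare_text are hypothetical names, and the choice of std::stable_sort is an assumption, not a statement about how the library sorts.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Three-way comparison in the style described above: <0, 0 or >0.
int compare_text(const std::string &a, const std::string &b)
{
    return a.compare(b);
}

// Sort items with a three-way comparator by turning it into a "less than" predicate.
template <typename T, typename F>
void sort_three_way(std::vector<T> &items, F f)
{
    std::stable_sort(items.begin(), items.end(),
        [&f](const T &a, const T &b) { return f(a, b) < 0; });
}
```

A three-way comparator is convenient when sorting on several columns: compare the first column and fall through to the next one only when the result is 0.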

void reorder_by_index()

Reorder the rows in the category using the index defined by the category_validator.

std::vector<std::string> get_tag_order() const

Return the list of fully qualified column names, that is, category_name + '.' + column_name for each column.
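
As a small illustration of the naming rule stated above, the following standalone sketch builds the qualified names from a category name and a list of column names. qualified_names is a hypothetical helper that follows the description literally; whether the library adds anything else to the names is not stated here.

```cpp
#include <string>
#include <vector>

// Build category_name + '.' + column_name for each column, preserving column order.
std::vector<std::string> qualified_names(const std::string &category_name,
    const std::vector<std::string> &column_names)
{
    std::vector<std::string> result;
    result.reserve(column_names.size());
    for (const std::string &column : column_names)
        result.push_back(category_name + '.' + column);
    return result;
}
```

A list in this form is the kind of value the order parameter of write takes, since it also names columns in a fixed sequence.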

void write(std::ostream &os) const

Write the contents of the category to the std::ostream os.

void write(std::ostream &os, const std::vector<std::string> &order, bool addMissingColumns = true)

Write the contents of the category to the std::ostream os and use order as the order of the columns. If addMissingColumns is false, columns that do not contain any value will be suppressed.

Parameters
  • os – The std::ostream to write to

  • order – The order in which the columns should appear

  • addMissingColumns – When false, empty columns are suppressed from the output

Friends

inline friend std::ostream &operator<<(std::ostream &os, const category &cat)

Friend function that makes it possible to write:

std::cout << my_category;